
docs(install): note aarch64 wheels are aarch64-sbsa, not L4T (Jetson) #1941

Merged
matthewdouglas merged 1 commit into bitsandbytes-foundation:main from neil-the-nowledgeable:docs/jetson-l4t-note
May 7, 2026

@neil-the-nowledgeable
Contributor

Summary

Adds a short WARNING callout to docs/source/installation.mdx, directly after the Linux aarch64 row of the PyPI build-targets table. It tells users that the aarch64 wheels are built on aarch64-sbsa (server ARM with the standard CUDA Toolkit), not on the L4T / JetPack runtime used by Jetson Orin / Xavier / Thor on CUDA 12. It then points readers at the working options: an on-device source build, or Jetson AI Lab's prebuilt index.
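
For readers who want to check which side of this split a machine is on, here is a minimal sketch. It is not part of this PR or of bitsandbytes. It assumes L4T / JetPack images ship a release marker at `/etc/nv_tegra_release`; that path and the helper names `is_l4t` and `install_hint` are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: L4T / JetPack images ship a release marker at this path.
L4T_RELEASE_FILE = Path("/etc/nv_tegra_release")


def is_l4t(marker: Path = L4T_RELEASE_FILE) -> bool:
    """Return True if the (assumed) L4T release marker file exists."""
    return marker.is_file()


def install_hint(marker: Path = L4T_RELEASE_FILE) -> str:
    """Map the detection result to the options named in this PR."""
    if is_l4t(marker):
        return (
            "L4T detected: build bitsandbytes from source on-device, "
            "or use a Jetson AI Lab prebuilt."
        )
    return "No L4T marker found: the L4T caveat for aarch64 wheels likely does not apply."


if __name__ == "__main__":
    print(install_hint())
```

The marker path is a parameter so the check can be exercised on a machine that is not a Jetson.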

Why

Per @matthewdouglas's review on #1939 and the original report at #1930: the failure mode is a CUDA symbol-resolution error from the toolchain mismatch (aarch64-sbsa CUDA libs vs L4T-bundled CUDA libs), not a missing-arch kernel-image issue. sm_80 cubins are binary-compatible with sm_87 hardware via Ampere-family binary compat — empirically confirmed in the #1939 thread (built bnb with -DCOMPUTE_CAPABILITY=80 only, verified cuobjdump --list-elf shows only sm_80.cubin, ran cleanly on sm_87 Jetson Orin).
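
The compatibility rule this argument relies on can be written as a tiny predicate. This is a sketch of the general same-major, equal-or-higher-minor rule only. `cubin_runs_on` is a hypothetical helper, not a CUDA or bitsandbytes API, and the sm_80-on-sm_87 case is the one confirmed empirically in the #1939 thread, not something this function proves.

```python
def cubin_runs_on(cubin_cc: tuple[int, int], device_cc: tuple[int, int]) -> bool:
    """Simplified binary-compatibility rule for SASS (cubin) code.

    A cubin built for compute capability X.y is treated as loadable on a
    device of compute capability X.z when z >= y. It is never treated as
    loadable across major versions or on a lower minor version.
    """
    cubin_major, cubin_minor = cubin_cc
    device_major, device_minor = device_cc
    return cubin_major == device_major and device_minor >= cubin_minor


if __name__ == "__main__":
    # The case discussed in this PR: sm_80 cubin on an sm_87 Jetson Orin.
    print(cubin_runs_on((8, 0), (8, 7)))
```

Under this rule an sm_80-only build is not the reason the PyPI wheel fails on Orin, which is why the arch list was the wrong thing to chase.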

Documenting the workaround is the lowest-effort way to help the next person who hits the symbol-not-found error find the right answer (source build or Jetson AI Lab prebuilt) without chasing the arch list as I originally did.

Scope

  • Single file: docs/source/installation.mdx
  • 8-line > [!WARNING] callout placed right after the Linux aarch64 PyPI row (line 67)
  • No build matrix changes, no test changes, no library code changes

Test plan

  • pre-commit run --files docs/source/installation.mdx passes (typos checker confirms no typos in the new prose; format hooks pass)
  • Markdown callout uses > [!WARNING] syntax matching existing usage in the same file

References

Adds a WARNING callout after the Linux aarch64 row of the PyPI build-targets
table, explaining that:

1. Wheels are built on aarch64-sbsa runners (standard CUDA Toolkit), not the
   L4T / JetPack runtime that Jetson Orin / Xavier / Thor (on CUDA 12) use.
2. The mismatch surfaces as 'Error named symbol not found in /src/csrc/ops.cu'
   on the first CUDA op — a symbol-resolution error, NOT a kernel-image-for-
   device error. The cubins ARE binary-compatible with the device per
   Ampere-family binary compat (sm_80 SASS runs on sm_87 hardware natively).
3. Working options on Jetson: on-device source build, or third-party prebuilt
   from Jetson AI Lab.

References bitsandbytes-foundation#1218 and bitsandbytes-foundation#1930 for the original error reports, and bitsandbytes-foundation#1939 for the
empirical confirmation that the fault is the toolchain delta, not the arch
list (sm_80-only cubin built on-device runs cleanly on sm_87 hardware).
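
The distinction drawn in item 2 can be sketched as a small triage helper. It is illustrative only and not part of this PR or of bitsandbytes. `classify_cuda_error` and its return labels are hypothetical, and the second substring is assumed to be the CUDA runtime's wording for a missing kernel image.

```python
def classify_cuda_error(message: str) -> str:
    """Roughly classify a CUDA error message seen on the first CUDA op.

    Returns one of:
      "toolchain-mismatch": symbol-resolution failure, the case this PR documents
      "missing-arch":       no compatible kernel image for the device
      "unknown":            anything else
    """
    text = message.lower()
    if "named symbol not found" in text:
        return "toolchain-mismatch"
    # Assumed wording of the CUDA runtime's "no kernel image" error.
    if "no kernel image is available" in text:
        return "missing-arch"
    return "unknown"


if __name__ == "__main__":
    print(classify_cuda_error("Error named symbol not found in /src/csrc/ops.cu"))
```

On a Jetson, a "toolchain-mismatch" result points at the two options listed in item 3, not at rebuilding with a different arch list.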

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@matthewdouglas added the Documentation (Improvements or additions to documentation) label on May 7, 2026
@github-actions

github-actions Bot commented May 7, 2026

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@matthewdouglas matthewdouglas merged commit 300a70f into bitsandbytes-foundation:main May 7, 2026
2 checks passed