
feat(llvm): add llvm19 support for compute_100+ #375

Open

brandonros wants to merge 1 commit into Rust-GPU:main from brandonros:llvm19-cfg

Conversation


@brandonros commented on Apr 14, 2026

attempt 2 of #227

@brandonros changed the title from "feat(llvm19): scaffold Layer 0 and record progress" to "feat(llvm): add llvm19 support for compute_100+" on Apr 14, 2026
@brandonros marked this pull request as ready for review on April 14, 2026 at 21:58
@brandonros
Author

@LegNeato this is a much cleaner approach, what do you think? can we see if CI passes?

@brandonros force-pushed the llvm19-cfg branch 10 times, most recently from 2b397fc to f53e57d on April 15, 2026 at 12:30
@brandonros
Author

proof it works in a limited capacity?

$ ./scripts/vast-ai.sh 
>> Building on brandon@asusrogstrix.local
warning: Git tree '/home/brandon/Rust-CUDA' has uncommitted changes
rust-cuda llvm19 shell
  CUDA_HOME=/usr/local/cuda-13.2
  LLVM_CONFIG_19=/nix/store/a7rsrh7cdbc8vzv72j1vc7936d4mapqm-llvm-19.1.7-dev/bin/llvm-config
  NVIDIA_DRIVER_LIB=/home/brandon/Rust-CUDA/.nix-driver-libs/libcuda.so.1
warning: vecadd@0.1.0: Building rustc_codegen_nvvm to satisfy cuda_builder requirements
   Compiling vecadd v0.1.0 (/home/brandon/Rust-CUDA/examples/vecadd)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.69s
>> Staging binary locally
vecadd                                                                                                                                        100% 6980KB  12.9MB/s   00:00    
>> Uploading to root@ssh6.vast.ai:34929
vecadd                                                                                                                                        100% 6980KB  11.4MB/s   00:00    
>> Running on vast.ai
GPU 0: NVIDIA GeForce RTX 5070 (UUID: GPU-cd9e55d4-294f-8a32-1e79-b3c13506c2c8)
[vecadd] cust::quick_init ...
using 131 blocks and 768 threads per block
0.09988744 + 0.3485085 = 0.44839594
[vecadd] cust::quick_init ok
[vecadd] CudaApiVersion::get ...
[vecadd] CudaApiVersion::get ok
[vecadd] CUDA driver API version: 13.2
[vecadd] Device::get_device(0) ...
[vecadd] Device::get_device(0) ok
[vecadd] Device::get_attribute(ComputeCapabilityMajor) ...
[vecadd] Device::get_attribute(ComputeCapabilityMajor) ok
[vecadd] Device::get_attribute(ComputeCapabilityMinor) ...
[vecadd] Device::get_attribute(ComputeCapabilityMinor) ok
[vecadd] Device::name ...
[vecadd] Device::name ok
[vecadd] GPU: NVIDIA GeForce RTX 5070 (compute 12.0)
[vecadd] PTX size: 1320 bytes
[vecadd] PTX header: // | // Generated by NVIDIA NVVM Compiler | // | // Compiler Build ID: UNKNOWN | // Cuda compilation tools, release 13.2, V13.2.78 | // Based on NVVM 22.0.0 | // |  | .version 9.2 | .target sm_100
[vecadd] cuModuleLoadDataEx (with JIT log buffers) ...
[vecadd] cuModuleLoadDataEx raw result code: CUDA_SUCCESS
[vecadd] cuModuleLoadDataEx (with JIT log buffers) ok
[vecadd] Stream::new ...
[vecadd] Stream::new ok
[vecadd] DeviceBuffer::from lhs ...
[vecadd] DeviceBuffer::from lhs ok
[vecadd] DeviceBuffer::from rhs ...
[vecadd] DeviceBuffer::from rhs ok
[vecadd] DeviceBuffer::from out ...
[vecadd] DeviceBuffer::from out ok
[vecadd] Module::get_function("vecadd") ...
[vecadd] Module::get_function("vecadd") ok
[vecadd] suggested_launch_configuration ...
[vecadd] suggested_launch_configuration ok
[vecadd] launching kernel ...
[vecadd] launch queued ok
[vecadd] stream.synchronize ...
[vecadd] stream.synchronize ok
[vecadd] copy_to ...
[vecadd] copy_to ok

Add the initial llvm19 cargo/build.rs plumbing while preserving the llvm7 check path. Assemble a v19 libintrinsics bitcode at build time and route nvvm.rs through the build-script-provided path.

Document the validated baseline on the current host and the first Layer 1 blocker: the existing C++ shim no longer builds unchanged against LLVM 19 because rustllvm.h still expects headers like llvm/ADT/Triple.h.

RUST_CUDA_ALLOW_LEGACY_ARCH_WITH_LLVM19

compute_100 target

working through compilation errors

working through sigsegv on vecadd

nix flake

libintrinsics

libintrinsics

chore(llvm19): close out Layer 3 pre-smoke work

Finalize the Layer 3 plan, add env-driven final-module and LLVM IR capture hooks to vecadd, and validate the harness locally so the next phase can move straight to CUDA 12.9+ smoke testing.

refactor(llvm19): close out Layer 2 containment

Add named Rust-side containment helpers for debug info and target machine creation, make the current ThinLTO behavior explicit, and update LLVM19_PLAN.md to mark Layers 2c and 2d complete.

refactor(llvm19): start Layer 2 helper containment

Add a small Rust-side helper surface in src/llvm.rs for call-building, symbol insertion, and debug-location setting, then migrate the obvious callers without introducing LLVM-version cfg branching.

Update LLVM19_PLAN.md to reflect the real Layer 2 state: 2a is complete, 2b is complete, 2c is partially landed, and 2d is still pending. Include the current .gitignore change in this checkpoint as requested.

feat(llvm19): complete Layer 1 C++ shim bridge

Bridge the wrapper headers and C++ shims so rustc_codegen_nvvm now builds against both LLVM 7 and LLVM 19.

This adds the LLVM 19 wrapper headers, ports RustWrapper.cpp and PassWrapper.cpp through the current checkpoint, and records the completed Layer 1 progress and remaining Layer 2 caveats in the plan.

ptxjitcompiler.so

load_ptx_with_log

unified?

Co-Authored-By: OpenAI Codex <codex@openai.com>
