Commit b6f2344
feat(pathfinder): add CTK root canary probe for non-standard-path libs (#1595)
* feat(pathfinder): add CTK root canary probe for non-standard-path libs
Libraries such as nvvm, whose shared object lives in a subdirectory
(/nvvm/lib64/) that is not on the system linker path, cannot
be found via a bare dlopen on system CTK installs when CUDA_HOME is not set.
Add a "canary probe" search step: when direct system search fails,
system-load a well-known CTK lib that IS on the linker path (cudart),
derive the CTK installation root from its resolved path, and look for
the target lib relative to that root via the existing anchor-point
logic. The mechanism is generic -- any future lib with a non-standard
path just needs its entry in _find_lib_dir_using_anchor_point.
The canary probe is intentionally placed after CUDA_HOME in the search
cascade to preserve backward compatibility: users who have CUDA_HOME
set expect it to be authoritative, and existing code relying on that
ordering should not silently change behavior.
Co-authored-by: Cursor <cursoragent@cursor.com>
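The ordering described above can be sketched as follows. This is a minimal illustration, not the pathfinder implementation: `find_nvvm_dir`, its parameters, and the hard-coded Linux relative path `nvvm/lib64` are assumptions made for the example, and the canary path is assumed to follow the simple `<root>/lib64/libcudart.so.*` layout.

```python
import os
from typing import Optional


def find_nvvm_dir(cuda_home: Optional[str], canary_path: Optional[str]) -> Optional[str]:
    """Hypothetical cascade: CUDA_HOME first, canary-derived CTK root second."""
    candidate_roots = []
    if cuda_home:
        # CUDA_HOME stays authoritative when it is set.
        candidate_roots.append(cuda_home)
    if canary_path:
        # Assumed simple layout: <root>/lib64/libcudart.so.* -> <root>
        candidate_roots.append(os.path.dirname(os.path.dirname(canary_path)))
    for root in candidate_roots:
        nvvm_dir = os.path.join(root, "nvvm", "lib64")
        if os.path.isdir(nvvm_dir):
            return nvvm_dir
    return None
```

With both arguments supplied, the directory under `cuda_home` wins whenever it exists; the canary root is only consulted as a fallback.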
* style(pathfinder): update copyright header date in test file
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor(pathfinder): use pytest-mock instead of unittest.mock in tests
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: fix typing
* fix(pathfinder): make CTK root discovery tests platform-aware
Tests that create fake CTK directory layouts were hardcoded to Linux
paths (lib64/, libnvvm.so) and failed on Windows where the code
expects Windows layouts (bin/, nvvm64.dll).
Extract platform-aware helpers (_create_nvvm_in_ctk, _create_cudart_in_ctk,
_fake_canary_path) that create the right layout and filenames based on
IS_WINDOWS.
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: style
* fix(pathfinder): normalize paths from _find_lib_dir_using_anchor_point
The rel_paths for nvvm use forward slashes (e.g. "nvvm/bin") which
os.path.join on Windows doesn't normalize, producing mixed-separator
paths like "...\nvvm/bin\nvvm64.dll". Apply os.path.normpath to the
returned directory so all separators are consistent.
Co-authored-by: Cursor <cursoragent@cursor.com>
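The mixed-separator problem can be reproduced on any platform with the standard `ntpath` module, which implements the Windows flavor of `os.path`. The install path below is a made-up example, not one taken from the change.

```python
import ntpath

# Hypothetical Windows CTK root joined with a forward-slash relative path.
mixed = ntpath.join(r"C:\CUDA\v13.0", "nvvm/bin")

# join() adds a backslash between the parts but leaves "nvvm/bin" untouched;
# normpath() rewrites every separator to a backslash.
normalized = ntpath.normpath(mixed)
```

Here `mixed` still contains the forward slash, while `normalized` uses backslashes throughout.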
* refactor(pathfinder): isolate CTK canary probe in subprocess
Resolve CTK canary absolute paths in a spawned Python process so probing cudart does not mutate loader state in the caller process while preserving the nvvm discovery fallback order. Keep JSON as the child-to-parent wire format because it cleanly represents both path and no-result states and avoids fragile stdout/path parsing across platforms.
Co-authored-by: Cursor <cursoragent@cursor.com>
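A sketch of why JSON suits this kind of child-to-parent channel: a string and `null` map directly onto "path found" and "no result", and the parent can validate the decoded type. The function names are hypothetical and the real payload shape may differ.

```python
import json
from typing import Optional


def encode_probe_result(abs_path: Optional[str]) -> str:
    """Child side (hypothetical): serialize either a path or the no-result state."""
    return json.dumps(abs_path)


def decode_probe_result(payload: str) -> Optional[str]:
    """Parent side (hypothetical): accept only a string or null."""
    value = json.loads(payload)
    if value is None or isinstance(value, str):
        return value
    raise RuntimeError(f"unexpected canary probe payload: {payload!r}")
```

Because JSON escapes backslashes, Windows paths survive the round trip without any custom stdout parsing.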
* fix(pathfinder): satisfy pre-commit typing for canary probe
Make canary subprocess path extraction explicitly typed and validated so mypy does not treat platform-specific loader results as Any while keeping probe behavior unchanged. Keep import ordering aligned with Ruff so pre-commit is green.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(pathfinder): use spawn isolation for CTK canary probing
Switch canary path resolution from subprocess.run to a shared multiprocessing spawn runner so child probes do not inherit potentially preloaded CUDA libraries from a forked parent. Reuse that runner from tests to keep one implementation for spawned process behavior.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(pathfinder): satisfy pre-commit for spawned runner utilities
Add the missing type annotations required by mypy and keep the test shim exporting only the runner entry point so lint checks pass cleanly.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(pathfinder): fail fast on canary probe child errors
Treat only a missing canary library as a recoverable probe miss and surface all other child-process failures immediately. This prevents development bugs from being silently masked as normal CTK canary fallback behavior.
Co-authored-by: Cursor <cursoragent@cursor.com>
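The fail-fast policy amounts to catching exactly one exception type. The sketch below uses a hypothetical `CanaryNotFoundError` and a generic `load` callable; the actual exception types and runner in pathfinder are not shown in this commit message.

```python
from typing import Callable, Optional


class CanaryNotFoundError(Exception):
    """Hypothetical marker for 'the canary library is not installed'."""


def probe_canary(load: Callable[[], str]) -> Optional[str]:
    try:
        return load()
    except CanaryNotFoundError:
        # Recoverable: treat as a probe miss and let the caller fall through.
        return None
    # Any other exception propagates, so bugs are not mistaken for a miss.
```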
* canary_probe_subprocess.py: remove unused main()
* refactor(pathfinder): simplify spawned runner usage in tests
Remove the tests-only re-export shim and import the shared spawned-process runner directly from pathfinder utils. This makes it more obvious that there is only one implementation.
Co-authored-by: Cursor <cursoragent@cursor.com>
* Extend copyright date back to original code.
* test(pathfinder): assert canary probe subprocess rethrows failures
Lock in the fail-fast canary policy by asserting the subprocess runner is invoked with rethrow enabled. This guards against accidental regressions that would silently downgrade child-process errors to probe misses.
Co-authored-by: Cursor <cursoragent@cursor.com>
* docs(pathfinder): clarify why canary probe must use spawn
Document that the canary probe runs in a spawned (not forked) child process so it starts from a clean interpreter state without inherited preloaded CUDA libraries. This explains why spawn is required for an independent system-search probe.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor(pathfinder): simplify CTK canary fallback scope
Centralize canary configuration in supported_nvidia_libs and gate CTK-root canary probing to discoverable libnames (currently nvvm). This avoids unnecessary canary subprocess work for other libraries and locks the behavior with a focused regression test.
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore(pathfinder): cache canary anchor probe lookups
Cache canary anchor-path resolution to avoid redundant spawned probe work while preserving retry-on-exception behavior. This is mostly a completeness/quality-of-implementation improvement today since only one discoverable lib currently uses the canary path, and tests now clear the cache per case to preserve isolation.
Co-authored-by: Cursor <cursoragent@cursor.com>
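One way to get caching with retry-on-exception is `functools.cache`, which stores return values only; a call that raises leaves no cache entry, so the next call runs the probe again. Whether the change uses this exact decorator is an assumption, and `make_cached_resolver` is a hypothetical wrapper for illustration.

```python
import functools
from typing import Callable, Optional


def make_cached_resolver(probe: Callable[[str], Optional[str]]) -> Callable[[str], Optional[str]]:
    """Wrap a (possibly expensive) probe so successful results are memoized."""

    @functools.cache
    def resolve(libname: str) -> Optional[str]:
        # If probe() raises, nothing is stored and a later call retries.
        return probe(libname)

    return resolve
```

Tests can call `resolve.cache_clear()` between cases to keep them isolated.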
* fix(pathfinder): derive CTK root from Linux targets layouts
Handle cudart paths under targets/<triple>/lib{,64} when deriving CTK root for canary-based nvvm discovery. Add focused regression tests for both lib64 and lib variants.
Co-authored-by: Cursor <cursoragent@cursor.com>
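A minimal sketch of root derivation that covers both the plain `lib{,64}` layout and the `targets/<triple>/lib{,64}` layout. The function name and the example paths are illustrative assumptions; the real helper may apply additional checks.

```python
from pathlib import PurePosixPath
from typing import Optional


def ctk_root_from_linux_cudart(cudart_path: str) -> Optional[str]:
    """Hypothetical helper: map a resolved libcudart path to its CTK root."""
    lib_dir = PurePosixPath(cudart_path).parent
    if lib_dir.name not in ("lib", "lib64"):
        return None
    parent = lib_dir.parent
    if parent.parent.name == "targets":
        # <root>/targets/<triple>/lib{,64}/libcudart.so.*
        return str(parent.parent.parent)
    # <root>/lib{,64}/libcudart.so.*
    return str(parent)
```

For example, a path under `/usr/local/cuda-13.0/targets/x86_64-linux/lib/` and one under `/usr/local/cuda-13.0/lib64/` both resolve to `/usr/local/cuda-13.0`.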
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rwgkio@gmail.com>
1 parent 67251a3 commit b6f2344
9 files changed
Lines changed: 576 additions & 13 deletions
File tree
- cuda_pathfinder
- cuda/pathfinder
- _dynamic_libs
- _utils
- tests
Lines changed: 30 additions & 0 deletions
Lines changed: 69 additions & 1 deletion
Lines changed: 94 additions & 4 deletions
Lines changed: 7 additions & 0 deletions
Lines changed: 9 additions & 3 deletions