Commit 69f711d

Add GSoC 2025 final post on advanced symbol resolution. (#351)
* Add GSoC 2025 final post on advanced symbol resolution
* Update spell check ignore list
1 parent 4e08b7a commit 69f711d

2 files changed

Lines changed: 217 additions & 0 deletions

.github/actions/spelling/allow/names.txt

Lines changed: 3 additions & 0 deletions
@@ -28,6 +28,7 @@ Guilherme
 Guiraud
 Hageboeck
 Hahnfeld
+Hames
 Harshitha
 Hilendarski
 Ikarashi
@@ -41,6 +42,7 @@ Joshi
 Jurgaityt
 Kyiv
 LBNL
+Lang
 Lattner
 Lavrijsen
 Li
@@ -172,6 +174,7 @@ kundu
 kundubaidya
 lange
 li
+lhames
 lucas
 maksym
 manasi
Lines changed: 214 additions & 0 deletions
@@ -0,0 +1,214 @@
---
title: "Wrapping Up GSoC 2025: Advanced symbol resolution for Clang-Repl"
layout: post
excerpt: "Advanced symbol resolution and re-optimization for Clang-Repl is a Google Summer of Code 2025 project. It aims to improve Clang-Repl and ORC JIT by adding support for automatically loading dynamic libraries when symbols are missing. This removes the need for users to load libraries manually and makes things work more smoothly."
sitemap: false
author: Sahil Patidar
permalink: blogs/gsoc25_sahil_wrapup_blog/
banner_image: /images/blog/gsoc_clang_repl.jpeg
date: 2026-01-15
tags: gsoc LLVM clang-repl ORC-JIT auto-loading
---

## Introduction

Hello! I’m Sahil Patidar, and this summer I had the opportunity to participate in Google Summer of Code (GSoC) 2025 with the LLVM Organization.
My project focused on enhancing ORC-JIT and `Clang-Repl` by introducing a new feature for advanced symbol resolution, aimed at improving runtime symbol handling and flexibility.

**Mentors**: Vassil Vassilev, Aaron Jomy

## Overview of the Project

[Clang-Repl](https://clang.llvm.org/docs/ClangRepl.html) is an interactive C++ interpreter built on top of LLVM’s ORC JIT, enabling incremental compilation and execution.
However, when user code references symbols from external libraries, those libraries must currently be loaded manually. This happens because ORC JIT does not automatically resolve symbols from libraries that haven’t been loaded yet.

To overcome this limitation, my project introduces an automatic library resolver for unresolved symbols in ORC JIT, improving Clang-Repl’s runtime by making external symbol handling seamless and user-friendly.

## Project Goals

The main goal of my project was to design and implement a new Library-Resolution API for ORC-JIT.
This API acts as a smart symbol resolver: when ORC-JIT encounters an unresolved symbol, it can call this API to find where the symbol exists and which library provides it.

The next step is to integrate this API into ORC-JIT, so that `Clang-Repl` can automatically use it to handle missing symbols without requiring manual library loading.

## Library-Resolution

During my GSoC project, one of the major components I worked on was Library-Resolution, an API we redesigned and reimplemented based on Cling’s original library resolver.

In simple terms, Library-Resolution acts as a smart library resolver.
It doesn’t actually load libraries. Instead, it works out which libraries define the missing symbols (unresolved references) and reports the paths of those libraries.

This makes it a powerful helper when dealing with unresolved symbols during execution.

### How It Works

When a client (ORC-JIT or any other system) encounters an unresolved symbol, it can call the resolver to locate that symbol.
The resolver scans through user-provided library paths, checks potential matches, and identifies the libraries that contain the missing symbols, all without directly loading them.

At the heart of this system is the `LibraryResolver`, which runs the resolution process by:

1. Scanning available libraries.
2. Filtering symbols efficiently using Bloom filters.
3. Matching unresolved symbols through a `SymbolQuery` tracker.

The result: symbols are mapped to their correct library paths, and the calling system can continue execution seamlessly.

### Core Components Overview

Here’s a quick breakdown of the key components that make Library-Resolution work:

#### 1. LibraryResolver

The main coordinator that controls the entire flow, from scanning libraries to managing symbol lookups.
It ensures that unresolved symbols are systematically matched to libraries.

#### 2. LibraryScanner

Handles the actual scanning of directories and library paths.
It detects valid shared libraries and registers them with the `LibraryManager`.

* **LibraryScanHelper** - Keeps track of directories that need to be scanned.
* **LibrarySearchPath** - Represents a directory and its type (User/System) along with its current scan state.
* **PathResolver** - Normalizes and resolves file paths efficiently.
* **LibraryPathCache** - Stores already-resolved paths and symbolic links to prevent repeated filesystem checks.

#### 3. LibraryManager

Maintains metadata about all discovered libraries.
Each library is represented by a `LibraryInfo` object containing:

* Library path
* Load status (loaded or not)
* A **Bloom filter** for fast symbol existence checks
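
To make the role of the per-library Bloom filter more concrete, here is a minimal, self-contained sketch. It is not the LLVM implementation: `SymbolBloomFilter`, `LibraryInfoSketch`, the filter size, and the hashing scheme are all assumptions chosen for illustration. The property it demonstrates is the one the resolver relies on: a negative answer is definitive, while a positive answer only means "possibly present" and still has to be verified against the library's real symbol table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-library Bloom filter (not LLVM's class).
class SymbolBloomFilter {
public:
  explicit SymbolBloomFilter(std::size_t NumBits, unsigned NumHashes = 3)
      : Bits(NumBits, false), NumHashes(NumHashes) {
    assert(NumBits > 0 && NumHashes > 0 && "filter must be non-empty");
  }

  // Record a symbol name by setting one bit per hash function.
  void insert(const std::string &Name) {
    for (unsigned I = 0; I < NumHashes; ++I)
      Bits[index(Name, I)] = true;
  }

  // false => the symbol is definitely absent.
  // true  => the symbol may be present; the caller must verify.
  bool mayContain(const std::string &Name) const {
    for (unsigned I = 0; I < NumHashes; ++I)
      if (!Bits[index(Name, I)])
        return false;
    return true;
  }

private:
  // Double hashing: derive the I-th bit position from two base hashes.
  std::size_t index(const std::string &Name, unsigned I) const {
    std::size_t H1 = std::hash<std::string>{}(Name);
    std::size_t H2 = fnv1a(Name) | 1; // keep the step odd
    return (H1 + I * H2) % Bits.size();
  }

  // 64-bit FNV-1a, used here only as a second, independent hash.
  static std::size_t fnv1a(const std::string &S) {
    std::uint64_t H = 14695981039346656037ull;
    for (unsigned char C : S) {
      H ^= C;
      H *= 1099511628211ull;
    }
    return static_cast<std::size_t>(H);
  }

  std::vector<bool> Bits;
  unsigned NumHashes;
};

// Hypothetical mirror of the three pieces of metadata listed above.
struct LibraryInfoSketch {
  std::string Path;
  bool Loaded = false;
  SymbolBloomFilter Filter{1024};
};
```

A caller would populate `Filter` once while scanning a library's exported symbols and then consult `mayContain` before doing any more expensive lookup in that library.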
## Symbol Resolution Flow

So how does symbol resolution actually work? Let’s walk through the process step by step.

1. **Start with unresolved symbols**
   The process begins with a list of unresolved symbols. These are passed to `LibraryResolver::searchSymbolsInLibraries`, where a `SymbolQuery` object is created to track the resolution state.

2. **Scan available libraries**
   The resolver scans both user-defined and system library paths to discover new or previously unregistered libraries that may contain the missing symbols.

3. **Filter symbols**
   As each library is inspected, its Bloom filter is used to rule out symbols the library definitely does not contain, so only possible matches are examined further. If a library doesn’t already have a Bloom filter, one is created. This filter allows for much faster symbol lookups in future scans.

4. **Match symbols to libraries**
   Each unresolved symbol is checked against the Bloom filters. When a potential match is found, it is verified, and the symbol is linked to the corresponding library path.

5. **Repeat until done**
   This cycle continues until all symbols are resolved or there are no remaining libraries that could provide valid matches.

6. **Complete and return results**
   Once the process finishes, the final resolution results are returned through a completion callback.
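
The six steps above can be sketched in a few dozen lines of plain C++. This is a simplified model under stated assumptions, not the ORC-JIT code. `FakeLibrary`, `QuerySketch`, and `searchSymbolsSketch` are hypothetical names, libraries are modeled as in-memory sets of exported names instead of files on disk, and a one-hash bitset stands in for the real Bloom filter.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory model of a discovered library.
struct FakeLibrary {
  std::string Path;
  std::set<std::string> Exports; // stands in for the real symbol table
  std::bitset<256> Filter;       // one-hash Bloom-style prefilter
  bool FilterBuilt = false;
};

static std::size_t slot(const std::string &Name) {
  return std::hash<std::string>{}(Name) % 256;
}

// Tracks which symbols are still pending and where resolved ones were found.
class QuerySketch {
public:
  explicit QuerySketch(const std::vector<std::string> &Symbols)
      : Pending(Symbols.begin(), Symbols.end()) {}

  bool allResolved() const { return Pending.empty(); }
  const std::set<std::string> &pending() const { return Pending; }
  const std::map<std::string, std::string> &results() const { return Resolved; }

  void resolve(const std::string &Sym, const std::string &LibPath) {
    Resolved[Sym] = LibPath;
    Pending.erase(Sym);
  }

private:
  std::set<std::string> Pending;
  std::map<std::string, std::string> Resolved;
};

using Completion = std::function<void(const QuerySketch &)>;

void searchSymbolsSketch(std::vector<FakeLibrary> &Libs,
                         const std::vector<std::string> &Symbols,
                         const Completion &OnComplete) {
  QuerySketch Q(Symbols);              // 1. start with unresolved symbols
  for (FakeLibrary &Lib : Libs) {      // 2. walk the discovered libraries
    if (Q.allResolved())               // 5. stop early once nothing is pending
      break;
    if (!Lib.FilterBuilt) {            // 3. build the filter on first visit
      for (const std::string &S : Lib.Exports)
        Lib.Filter.set(slot(S));
      Lib.FilterBuilt = true;
    }
    // Iterate over a copy because resolve() mutates the pending set.
    const std::set<std::string> Candidates = Q.pending();
    for (const std::string &Sym : Candidates) {
      if (!Lib.Filter.test(slot(Sym))) // filter says: definitely absent
        continue;
      if (Lib.Exports.count(Sym))      // 4. verify the potential match
        Q.resolve(Sym, Lib.Path);
    }
  }
  OnComplete(Q);                       // 6. report through the callback
}
```

Symbols that no library provides simply stay in the pending set, so the callback can distinguish resolved symbols from ones that remain missing.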

## Summary of accomplished tasks

### ExecutorResolver:
[#143654](https://github.com/llvm/llvm-project/pull/143654)

As suggested by Lang Hames, we introduced a `DylibSymbolResolver` that helps resolve symbols for each loaded dylib.

Previously, we returned a DylibHandle to the controller. Now, we wrap the native handle inside `DylibSymbolResolver` and return a `ResolverHandle` instead. This makes the code cleaner and separates the symbol resolution logic from raw handle management.

These changes lay the groundwork for integrating the LibraryResolver API in the future through a new `AutoDylibResolver`.
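
The idea of handing out a resolver handle instead of a raw dylib handle can be illustrated with a small standalone sketch. Everything here is hypothetical and only models the shape of the design described above: `DylibResolverSketch`, `ResolverTableSketch`, the integer `ResolverHandle`, and the injected lookup callback are illustrative assumptions, not the types or signatures used in the pull request.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

using NativeHandle = void *;          // e.g. whatever the platform loader returns
using ResolverHandle = std::uint64_t; // opaque id given to the controller
using LookupFn = std::function<void *(NativeHandle, const std::string &)>;

// Owns the native handle and knows how to look up symbols in it.
class DylibResolverSketch {
public:
  DylibResolverSketch(NativeHandle H, LookupFn Fn)
      : Handle(H), Lookup(std::move(Fn)) {
    assert(Lookup && "a lookup callback is required");
  }

  void *resolve(const std::string &Name) const { return Lookup(Handle, Name); }

private:
  NativeHandle Handle;
  LookupFn Lookup;
};

// Maps opaque resolver handles to resolver objects, so callers never see
// (or manage) the raw native handle.
class ResolverTableSketch {
public:
  ResolverHandle add(NativeHandle H, LookupFn Fn) {
    ResolverHandle Id = NextId++;
    Resolvers.emplace(Id, DylibResolverSketch(H, std::move(Fn)));
    return Id;
  }

  // Returns nullptr for an unknown handle or an unknown symbol.
  void *resolve(ResolverHandle Id, const std::string &Name) const {
    auto It = Resolvers.find(Id);
    return It == Resolvers.end() ? nullptr : It->second.resolve(Name);
  }

private:
  std::map<ResolverHandle, DylibResolverSketch> Resolvers;
  ResolverHandle NextId = 1;
};
```

Because lookups go through the resolver object, a different resolver (for example, one that consults a library-resolution step first) could in principle be registered behind the same kind of handle without changing the caller.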

### Library-Resolver API:
[#165360](https://github.com/llvm/llvm-project/pull/165360)

This is the main API, which we redesigned based on Cling’s automatic library resolver. It lets users add search paths, request a search for a set of symbols, and get back the resolved library for each symbol.

The goal is to make library discovery and symbol resolution more straightforward, while keeping the design flexible for future improvements.

## What the Library Resolution API Can Do

With these updates, the **Library Resolution API** is now fully operational. It can find missing symbols at runtime and figure out which shared libraries they belong to, without loading those libraries into memory.

The API searches through both system and user-defined paths, looks for unresolved symbols, and pinpoints the exact libraries where those symbols are defined.

Because of this, the API is especially useful for dynamic runtime systems like **ORC-JIT** and **Clang-Repl**, where symbols often need to be resolved on the fly without slowing things down or breaking execution.

Below is a simple example showing how to set up the API and start the resolution process:

```cpp
// `Controller`, `SearchPaths`, `MangledName`, and `Res` are assumed to be
// declared by the surrounding code.
llvm::orc::LibraryResolver::Setup S =
    llvm::orc::LibraryResolver::Setup::create({});

// Define a callback that decides whether a library should be scanned
S.ShouldScanCall = [&](llvm::StringRef lib) -> bool { return true; };

// Create the driver that coordinates the resolution
Controller = llvm::orc::LibraryResolutionDriver::create(S);

// Add the library paths to be scanned (registered here as user paths)
for (const auto &SP : SearchPaths)
  Controller->addScanPath(SP, llvm::orc::PathType::User);

// Prepare the symbols to be resolved
llvm::SmallVector<llvm::StringRef> Sym;
Sym.push_back(MangledName);

// Configure resolution policy
llvm::orc::SearchConfig Config;
Config.Policy = {
    {{llvm::orc::LibraryManager::LibState::Queried, llvm::orc::PathType::User},
     {llvm::orc::LibraryManager::LibState::Unloaded, llvm::orc::PathType::User},
     {llvm::orc::LibraryManager::LibState::Queried, llvm::orc::PathType::System},
     {llvm::orc::LibraryManager::LibState::Unloaded, llvm::orc::PathType::System}}};

Config.Options.FilterFlags =
    llvm::orc::SymbolEnumeratorOptions::IgnoreUndefined;

// Run the symbol resolution; the callback receives the finished query
Controller->resolveSymbols(
    Sym,
    [&](llvm::orc::LibraryResolver::SymbolQuery &Q) {
      if (auto Lib = Q.getResolvedLib(MangledName))
        Res = *Lib;
    },
    Config);
```

### Other PRs

* [#166510](https://github.com/llvm/llvm-project/pull/166510)
* [#166147](https://github.com/llvm/llvm-project/pull/166147)
* [#169161](https://github.com/llvm/llvm-project/pull/169161)

## Future Work

The next step will be to continue development on the ORC-JIT side, aligned with the ongoing evolution of the Executor layer.
Once the new Executor design stabilizes, we’ll revisit the Library-Resolution API and make any necessary adjustments for compatibility and cleaner integration.
This work will be done under the guidance of Lang Hames, ensuring it fits well within the evolving ORC architecture.

The next phase of this project focuses on integrating the Library-Resolution API into ORC-JIT.
Specifically, we plan to:

* **Introduce the AutoDylibResolver**, based on the groundwork implemented in the *ExecutorResolver* pull request.
  This component will allow ORC-JIT to automatically invoke the Library-Resolution API whenever it encounters an unresolved symbol.

* **Enable automatic library loading** in ORC-JIT.
  Once integrated, ORC-JIT (and tools like `Clang-Repl`) will be able to automatically locate and load the required libraries at runtime, removing the need for users to load them manually.

This next step will complete the feature chain, from symbol detection to automatic library resolution and loading, making `Clang-Repl` more user-friendly.

## Conclusion

With this project, we now have a working Library-Resolver API implemented in ORC-JIT. This is an important step toward making library handling more automatic and reliable at runtime.
The next phase of work will focus on integrating this API more deeply into ORC-JIT. This will allow automatic library loading, which will directly benefit tools like `Clang-Repl` and other projects that rely on ORC-JIT as their execution engine.
This project has been a great learning experience, and I’m excited about the improvements it can bring to the LLVM ecosystem.
Thank you for following along on my GSoC 2025 journey!

## Acknowledgements

I would like to thank Google Summer of Code (GSoC) and LLVM for the opportunity to work on this project. Special thanks to my mentor Vassil Vassilev for his guidance and support, and to Lang Hames for his helpful insights on ORC-JIT and Clang-Repl.

## Related Links

- [LLVM Repository](https://github.com/llvm/llvm-project)
- [Project Description](https://discourse.llvm.org/t/gsoc2025-advanced-symbol-resolution-and-reoptimization-for-clang-repl/84624/3)
- [My GitHub Profile](https://github.com/SahilPatidar)
