25 changes: 11 additions & 14 deletions bins/dwarf-tool/src/main.rs
@@ -4,9 +4,9 @@

use anyhow::Result;
use clap::{Parser, Subcommand};
use ghostscope_dwarf::core::SectionType;
use ghostscope_dwarf::{
AddressQueryResult, DwarfAnalyzer, FunctionQueryResult, ModuleLoadingEvent, ModuleLoadingStats,
SectionType,
};
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
@@ -629,7 +629,7 @@ async fn analyze_source_location(

fn iter_address_query_variables<'a>(
address: &'a AddressQueryResult,
) -> impl Iterator<Item = &'a ghostscope_dwarf::VariableWithEvaluation> + 'a {
) -> impl Iterator<Item = &'a ghostscope_dwarf::VisibleVariable> + 'a {
address.parameters.iter().chain(address.variables.iter())
}

@@ -641,11 +641,11 @@ fn total_variables_in_query_results(addresses: &[AddressQueryResult]) -> usize {
addresses.iter().map(query_address_variable_count).sum()
}

fn variable_info_from_query(variable: &ghostscope_dwarf::VariableWithEvaluation) -> VariableInfo {
fn variable_info_from_query(variable: &ghostscope_dwarf::VisibleVariable) -> VariableInfo {
VariableInfo {
name: variable.name.clone(),
type_name: variable.type_name.clone(),
location: format!("{}", variable.evaluation_result),
location: format!("{}", variable.location),
is_parameter: variable.is_parameter,
scope_depth: variable.scope_depth as u32,
}
@@ -750,7 +750,7 @@ async fn analyze_function(
format!("{} (no DWARF info)", var.type_name)
};

println!("{}: {} = {}", var.name, type_str, var.evaluation_result);
println!("{}: {} = {}", var.name, type_str, var.location);
}
} else {
println!(" Address: 0x{:x}", address.address);
@@ -796,10 +796,7 @@ async fn analyze_module_address(
);
} else if options.quiet() {
for var in iter_address_query_variables(&address_info) {
println!(
"{}: {} = {}",
var.name, var.type_name, var.evaluation_result
);
println!("{}: {} = {}", var.name, var.type_name, var.location);
}
} else {
println!("\n=== {module_path} @ 0x{address:x} ===");
@@ -1175,7 +1172,7 @@ fn percentile_nearest_rank(sorted_samples_ms: &[f64], percentile: f64) -> f64 {
}

fn print_variables_with_style<'a>(
variables: impl IntoIterator<Item = &'a ghostscope_dwarf::VariableWithEvaluation>,
variables: impl IntoIterator<Item = &'a ghostscope_dwarf::VisibleVariable>,
options: &Commands,
) {
for (i, var) in variables.into_iter().enumerate() {
@@ -1194,7 +1191,7 @@ fn print_variables_with_style<'a>(
println!(" Scope Depth: {}", var.scope_depth);
println!(" Is Parameter: {}", var.is_parameter);
println!(" Is Artificial: {}", var.is_artificial);
println!(" Location: {}", var.evaluation_result);
println!(" Location: {}", var.location);
println!();
} else {
let param_marker = if var.is_parameter { " (param)" } else { "" };
@@ -1213,14 +1210,14 @@ fn print_variables_with_style<'a>(

println!(
" ├─ {}: {} = {}{}{}",
var.name, type_str, var.evaluation_result, param_marker, artificial_marker
var.name, type_str, var.location, param_marker, artificial_marker
);
}
}
}

fn print_variables_with_indent<'a>(
variables: impl IntoIterator<Item = &'a ghostscope_dwarf::VariableWithEvaluation>,
variables: impl IntoIterator<Item = &'a ghostscope_dwarf::VisibleVariable>,
indent: &str,
) {
for var in variables {
@@ -1240,7 +1237,7 @@ fn print_variables_with_indent<'a>(

println!(
"{}├─ {}: {} = {}{}{}",
indent, var.name, type_str, var.evaluation_result, param_marker, artificial_marker
indent, var.name, type_str, var.location, param_marker, artificial_marker
);
}
}
17 changes: 12 additions & 5 deletions docs/architecture.md
@@ -57,7 +57,7 @@ GhostScope uses Cargo workspace for modular design:
|-------|---------|
| **ghostscope** | Main binary and runtime coordinator - orchestrates all components via async event loop |
| **ghostscope-compiler** | Script compilation pipeline - transforms user scripts into verified eBPF bytecode via LLVM |
| **ghostscope-dwarf** | Debug information analyzer - provides cross-module symbol resolution and type information |
| **ghostscope-dwarf** | PC-context DWARF semantic engine - resolves source locations, visible variables, type layouts, address mappings, and compiler read plans |
| **ghostscope-loader** | eBPF program lifecycle manager - handles uprobe attachment and ring buffer management via Aya |
| **ghostscope-ui** | Terminal user interface - implements interactive TUI with TEA (The Elm Architecture) pattern |
| **ghostscope-protocol** | Communication protocol - defines message format for eBPF-userspace data exchange |
@@ -88,9 +88,10 @@ GhostScope uses Cargo workspace for modular design:

**Key feature**: Progressive loading with callbacks for UI progress updates.

### 3. DWARF Analyzer
### 3. DWARF Semantic Engine

**Role**: High-performance multi-module debug information system.
**Role**: High-performance multi-module debug information system and
PC-context semantic planner.

**Core Optimizations**:

@@ -115,6 +116,12 @@ GhostScope uses Cargo workspace for modular design:
- Virtual address to file offset conversion
- Runtime address mapping for process-specific traces

5. **PC-Context Read Planning**
- Resolves locals, parameters, globals, and inline scopes at a specific probe PC
- Produces typed read plans for the compiler instead of exposing raw DWARF locations
- Preserves semantic distinctions such as optimized-out values, rebased absolute addresses, and value-backed aggregates
- Reports compile-time diagnostics when a variable is visible but cannot be safely lowered
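
The read-planning bullets above can be illustrated with a minimal Rust sketch. Every name in it (`ReadPlan`, `Lowering`, `lower`) is hypothetical and chosen only for illustration; none of it is taken from `ghostscope-dwarf`. It also assumes that a "value-backed aggregate" means an aggregate whose bytes are known directly rather than through a memory address. The point is only that a typed plan keeps cases such as optimized-out values distinct, so the compiler can either emit a read or report a diagnostic.

```rust
// Hypothetical sketch: these types are NOT the ghostscope-dwarf API.

/// How a visible variable could be read at one probe PC.
#[derive(Debug, Clone, PartialEq)]
enum ReadPlan {
    /// The value lives in a register (DWARF register number).
    Register(u16),
    /// The value lives in memory at a signed offset from the frame base.
    FrameOffset(i64),
    /// A link-time address that must be rebased with the module load bias.
    RebasedAddress(u64),
    /// Assumption: the aggregate's bytes are known directly, with no address.
    ValueBacked(Vec<u8>),
    /// The variable is visible, but its value was optimized out at this PC.
    OptimizedOut,
}

/// Outcome of trying to lower a plan into a runtime read.
#[derive(Debug, PartialEq)]
enum Lowering {
    /// A textual stand-in for the read the compiler would emit.
    Emit(String),
    /// Visible but not safely lowerable: surface a compile-time diagnostic.
    Diagnostic(String),
}

fn lower(name: &str, plan: &ReadPlan) -> Lowering {
    match plan {
        ReadPlan::Register(r) => Lowering::Emit(format!("{name}: read register {r}")),
        ReadPlan::FrameOffset(off) => {
            Lowering::Emit(format!("{name}: read memory at frame_base{off:+}"))
        }
        ReadPlan::RebasedAddress(addr) => {
            Lowering::Emit(format!("{name}: read memory at load_bias+0x{addr:x}"))
        }
        ReadPlan::ValueBacked(bytes) => {
            Lowering::Emit(format!("{name}: emit {} constant bytes", bytes.len()))
        }
        ReadPlan::OptimizedOut => {
            Lowering::Diagnostic(format!("{name}: optimized out at this PC"))
        }
    }
}

fn main() {
    let plans = [
        ("argc", ReadPlan::Register(5)),
        ("buf", ReadPlan::FrameOffset(-16)),
        ("counter", ReadPlan::RebasedAddress(0x4010)),
        ("pair", ReadPlan::ValueBacked(vec![1, 0, 0, 0])),
        ("tmp", ReadPlan::OptimizedOut),
    ];
    for (name, plan) in &plans {
        println!("{:?}", lower(name, plan));
    }
}
```

Modeling the plan as an enum forces every consumer to handle the unavailable cases explicitly, which is one way to get the compile-time diagnostics the last bullet describes.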

TODO: Still slow, need to research how GDB optimizes DWARF parsing performance.

### 4. Compilation Pipeline
@@ -135,9 +142,9 @@ Multi-stage pipeline with type safety at each level:
┌──────────────────────────────────────────────────────────┐
│ Stage 2: LLVM IR Generation │
│ │
│ AST + DWARF Info │
│ AST + PC Context + DWARF Read Plans │
│ ↓ │
│ Symbol Resolution (variables, types, locations) │
│ Plan Lowering (variables, types, availability) │
│ ↓ │
│ LLVM IR (type-safe intermediate representation) │
└──────────────────────────────────────────────────────────┘
14 changes: 7 additions & 7 deletions docs/comparison.md
@@ -68,17 +68,17 @@ We now ship a reproducible single-thread benchmark for one narrow question: "wha

| Aspect | GhostScope | perf probe / perf uprobes |
|---|---|---|
| Positioning | Purpose-built userspace tracer with runtime DWARF evaluation, a small DSL, and a TUI/session workflow | Declarative probe-definition frontend plus the broader perf recording and reporting pipeline |
| Positioning | Purpose-built userspace tracer with PC-context DWARF planning, a small DSL, and a TUI/session workflow | Declarative probe-definition frontend plus the broader perf recording and reporting pipeline |
| Programmability and safety model | eBPF-backed collection logic with programmable filtering and formatting; flexibility is constrained by the verifier | Narrower, more declarative capability surface: define probe points and fetchargs, but not an eBPF-style "run custom logic on each hit" programming model |
| Source-level frontend | Function, source-line, and instruction-oriented tracing are core workflows | Strong native support for functions, source lines, locals, and inline-related probe discovery inside `perf probe` |
| Variable access style | Runtime DWARF evaluation for locals, parameters, and globals; typed rendering in the tracer UI | Declarative fetchargs for locals, parameters, registers, symbols, arrays, and return values |
| Variable access style | Compile/load-time DWARF read planning for locals, parameters, and globals, followed by eBPF runtime reads and typed rendering | Declarative fetchargs for locals, parameters, registers, symbols, arrays, and return values |
| Inline and discovery workflow | Good source-driven attachment, but within GhostScope's tracer model | Mature discovery workflow for lines, functions, and inline-related probe search such as `--line`, `--vars`, and `--no-inlines` |
| What happens after a hit | Structured data can be filtered, sampled, and shaped before delivery to userspace | Mostly fixed event-field extraction, then hand off to the perf recording and reporting toolchain |
| Output and consumption | RingBuf or PerfEventArray to a custom realtime reader/TUI | Common path is `perf probe` -> `perf record` -> `perf.data` -> `perf report` or `perf script` |
| Best at | Production-oriented live userspace diagnosis with structured output and a dedicated runtime workflow | Quick one-off probes and reuse of the existing perf ecosystem |
| Tradeoff | More opinionated; not meant to be the general perf toolkit | Less programmable than eBPF-based tracers and less centered on custom realtime processing |

Choose GhostScope when you want a purpose-built online tracer with runtime DWARF semantics, programmable filtering, and a friendlier live diagnosis workflow. Choose perf when you want to quickly place a function, source-line, or local-variable probe and stay inside the perf ecosystem.
Choose GhostScope when you want a purpose-built online tracer with PC-context DWARF semantics, programmable filtering, and a friendlier live diagnosis workflow. Choose perf when you want to quickly place a function, source-line, or local-variable probe and stay inside the perf ecosystem.

Background: a practical shorthand is `perf probe` = more fixed-semantics and ready-to-use, not "zero configurability"; GhostScope's eBPF-backed tracer model trades that simplicity for more programmable hit handling and a richer live workflow.

@@ -87,10 +87,10 @@ Background: a practical shorthand is `perf probe` = more fixed-semantics and rea
| Aspect | GhostScope | bpftrace |
|---|---|---|
| Positioning | DWARF-aware userspace observation; restores source-level semantics | General-purpose eBPF dynamic tracer; event observation and aggregation |
| DWARF usage | Evaluates DWARF expressions at runtime; reads params, locals, and globals | Parses args and structs; not centered on runtime evaluation of location expressions |
| DWARF usage | Plans variable reads from DWARF at compile/load time, then emits eBPF reads for params, locals, and globals | Parses args and structs; not centered on PC-context variable read planning |
| Attachment granularity and symbols | Line-table-driven source-line and instruction attachment, plus function-oriented tracing | Entry/return, in-function offsets, absolute locations, and event probes; no built-in line-to-address workflow |
| Observable data | Supports locals, parameters, and globals; renders values with real types | Strong for arguments, structs, and event streams; less focused on recovering arbitrary live userspace state |
| ASLR impact | Runtime DWARF computation naturally adapts to ASLR and PIE | `uaddr()`-style global reads become awkward or unavailable under ASLR and PIE |
| ASLR impact | DWARF read plans preserve rebasing requirements for PIE, shared libraries, and absolute-address values | `uaddr()`-style global reads become awkward or unavailable under ASLR and PIE |
| Interaction experience | TUI-friendly, observe without interruption | Script-style output and aggregation; less interactive |
| Best at | Recovering real userspace state from live code paths | Correlating many event sources quickly |
| Tradeoff | Narrower scope | Less focused on source-level userspace diagnosis |
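
The ASLR row mentions rebasing for PIE binaries and shared libraries. As a rough illustration of what rebasing a link-time address involves, here is a minimal Rust sketch. The `Segment` struct and the `load_bias` and `rebase` functions are hypothetical names, not GhostScope's API; the arithmetic is the usual ELF relationship between a segment's `p_vaddr`/`p_offset` and the runtime mapping that backs it, and the sample addresses are made up.

```rust
// Hypothetical sketch: not GhostScope's implementation.

/// One loadable ELF segment together with the runtime mapping that backs it.
struct Segment {
    /// Runtime start address of the mapping (for example from /proc/<pid>/maps).
    map_start: u64,
    /// File offset at which that mapping begins.
    map_file_offset: u64,
    /// Link-time virtual address of the segment (ELF `p_vaddr`).
    p_vaddr: u64,
    /// File offset of the segment (ELF `p_offset`).
    p_offset: u64,
}

/// Load bias: the value to add to a link-time address to get a runtime address.
/// Wrapping arithmetic keeps the intermediate steps well-defined.
fn load_bias(seg: &Segment) -> u64 {
    seg.map_start
        .wrapping_sub(seg.map_file_offset)
        .wrapping_add(seg.p_offset)
        .wrapping_sub(seg.p_vaddr)
}

/// Rebase a link-time address (such as a global's DWARF address) to runtime.
fn rebase(link_addr: u64, bias: u64) -> u64 {
    link_addr.wrapping_add(bias)
}

fn main() {
    // A PIE executable whose segment was mapped at a randomized address.
    let pie = Segment {
        map_start: 0x5555_5555_5000,
        map_file_offset: 0x1000,
        p_vaddr: 0x1000,
        p_offset: 0x1000,
    };
    let bias = load_bias(&pie);
    println!("bias = 0x{bias:x}, global = 0x{:x}", rebase(0x4010, bias));

    // A non-PIE executable is mapped at its link-time address, so the bias is 0.
    let fixed = Segment {
        map_start: 0x40_0000,
        map_file_offset: 0,
        p_vaddr: 0x40_0000,
        p_offset: 0,
    };
    println!("non-PIE bias = 0x{:x}", load_bias(&fixed));
}
```

Because the bias is a per-process runtime value, a plan that only records "this address needs rebasing" can be built once from DWARF and applied to any process that maps the module.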
@@ -105,8 +105,8 @@ Background: one motivation for GhostScope was that newer bpftrace versions no lo
|---|---|---|
| Position and scope | DWARF-aware userspace observation aimed at production printf-style debugging with an interactive workflow | Broad tracing framework with kernel and userspace coverage, including an eBPF backend |
| Source line and statement probes | Supported; line-level attachment is a core path | Supported; statement probes can be resolved and attached |
| Variable access (params, locals, globals) | Supported. Evaluate DWARF at runtime with gimli; render by real types; naturally ASLR and PIE friendly | Supported. DWARF location expressions are lowered through SystemTap's pipeline into eBPF-compatible logic, with verifier and stack constraints |
| DWARF expression handling | Evaluate DWARF in userspace and collect values via eBPF programs | Translate DWARF operations into internal representations and lower them into eBPF instruction sequences |
| Variable access (params, locals, globals) | Supported. Build PC-context read plans with gimli-backed DWARF data; render by real types; naturally ASLR and PIE friendly | Supported. DWARF location expressions are lowered through SystemTap's pipeline into eBPF-compatible logic, with verifier and stack constraints |
| DWARF expression handling | Convert DWARF locations into semantic read plans and lower supported plans into eBPF runtime reads | Translate DWARF operations into internal representations and lower them into eBPF instruction sequences |
| Stack unwinding (CFI) | Not supported yet; planned via `.eh_frame` unwinding | Not supported in the eBPF backend |
| Event transport and formatting | RingBuf (on newer kernels) or PerfEventArray; configurable pages and event size; built-in dump helpers such as `{:x.N}`, `{:s.N}`, and `{:p}` | PERF_EVENT_ARRAY plus userspace formatting/interpreter flow; formatting and string handling are more constrained |
| BTF, CO-RE, linkage | Aya ecosystem, prefer RingBuf; not centered on BTF or CO-RE | No BTF or CO-RE focus; minimal libbpf-style backend |
16 changes: 11 additions & 5 deletions docs/zh/architecture.md
@@ -57,7 +57,7 @@ GhostScope 使用 Cargo workspace 进行模块化设计:
|-------|------|
| **ghostscope** | 主程序和运行时协调器 - 通过异步事件循环协调所有组件 |
| **ghostscope-compiler** | 脚本编译流水线 - 通过 LLVM 将用户脚本转换为经过验证的 eBPF 字节码 |
| **ghostscope-dwarf** | 调试信息分析器 - 提供跨模块符号解析和类型信息 |
| **ghostscope-dwarf** | PC 上下文 DWARF 语义引擎 - 解析源码位置、可见变量、类型布局、地址映射和编译器读取计划 |
| **ghostscope-loader** | eBPF 程序生命周期管理器 - 通过 Aya 处理 uprobe 附加和 ring buffer 管理 |
| **ghostscope-ui** | 终端用户界面 - 实现基于 TEA (The Elm Architecture) 模式的交互式 TUI |
| **ghostscope-protocol** | 通信协议 - 定义 eBPF 与用户态数据交换的消息格式 |
@@ -88,9 +88,9 @@ GhostScope 使用 Cargo workspace 进行模块化设计:

**关键特性**:渐进式加载,带有 UI 进度更新回调。

### 3. DWARF 分析器
### 3. DWARF 语义引擎

**角色**:高性能多模块调试信息系统。
**角色**:高性能多模块调试信息系统,以及基于 PC 上下文的语义规划器

**核心优化**:

@@ -115,6 +115,12 @@ GhostScope 使用 Cargo workspace 进行模块化设计:
- 虚拟地址到文件偏移的转换
- 针对特定进程追踪的运行时地址映射

5. **PC 上下文读取计划**
- 在指定 probe PC 上解析局部变量、参数、全局变量和 inline 作用域
- 向编译器输出带类型的读取计划,而不是暴露原始 DWARF 位置
- 保留 optimized-out、需要重定位的绝对地址、value-backed 聚合等语义差异
- 当变量可见但无法安全 lower 时,给出编译期诊断

TODO: 但是依然很慢,需要继续研究 GDB 是怎么提升解析 DWARF 性能的。

### 4. 编译流水线
@@ -135,9 +141,9 @@ TODO: 但是依然很慢,需要继续研究 GDB 是怎么提升解析 DWARF
┌──────────────────────────────────────────────────────────┐
│ 阶段 2:LLVM IR 生成 │
│ │
│ AST + DWARF 信息 │
│ AST + PC 上下文 + DWARF 读取计划 │
│ ↓ │
│ 符号解析(变量、类型、位置) │
│ 计划 Lowering(变量、类型、可用性) │
│ ↓ │
│ LLVM IR(类型安全的中间表示) │
└──────────────────────────────────────────────────────────┘