
Commit 658bbe6

fix(crash-symbolize): resolve libdwarf crash and incorrect line mapping
- Fix potential crash in libdwarf call path:
  - Reopen file descriptor before DWARF parsing
  - Filter invalid CU using DW_AT_stmt_list
  - Correct usage of output parameters for dwarf_srclines_b
- Fix incorrect line mapping result:
  - Move 'best match range' logic to correct scope
  - Ensure line selection is based on per-CU context instead of global state

These changes prevent NULL dereference inside libdwarf and improve accuracy of source line resolution.
1 parent e5b5a46 commit 658bbe6

6 files changed

Lines changed: 744 additions & 132 deletions

File tree

agent/src/ebpf/crash-monitor.md

Lines changed: 36 additions & 18 deletions
@@ -90,9 +90,11 @@ The crash monitor is a **diagnostic tool**, not a recovery tool. After the handler finishes writing the snapshot
 3. Prepare an altstack for the current thread;
 4. Read `/proc/self/maps` and pre-cache executable file-backed mappings;
 5. Record the main program path behind `/proc/self/exe`;
-6. Pre-read the GNU build-id (if present) for every cached module.
+6. Cache the process/task name from `/proc/self/comm`, as a fallback name for the main thread and for old snapshots;
+7. Pre-read the GNU build-id (if present) for every cached module;
+8. For worker threads created through the monitored helper, write the expected thread name into the crash monitor's thread-local cache before entering the real work function, then call `crash_monitor_prepare_thread()`.
 
-These preparations mean that at crash time the handler only needs to "copy fixed data + write to disk" instead of discovering the module layout on the fly.
+These preparations mean that at crash time the handler only needs to "copy fixed data + write to disk" instead of discovering the module layout on the fly, and it no longer needs to read name information such as `/proc/thread-self/comm` from the signal context.
 
 ### 3.2 Stage-1 crash capture (fatal signal context)
 
@@ -101,7 +103,7 @@ After a fatal signal arrives, the handler runs on the altstack and mainly does the following:
 1. Read fault information such as `si_code` and `si_addr` from `siginfo_t`;
 2. Extract registers from `ucontext_t`: `ip/sp/fp/lr/args[]`;
 3. Perform a bounded frame-pointer backtrace;
-4. Copy the pre-cached `modules[]`, `modules_count`, and `executable_path` into the record;
+4. Copy the pre-cached `modules[]`, `modules_count`, `executable_path`, and `thread_name` into the record;
 5. For each frame, fill in:
    - `absolute_pc`
    - `module_index`
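The bounded frame-pointer backtrace in step 3 can be sketched as below. This is an illustration under the usual assumption that each frame starts with `{saved fp, return address}` (the common layout with frame pointers enabled); the names, limits, and the synthetic demo chain are not the agent's real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAMES 16

struct fp_frame {
	uint64_t prev_fp;	/* saved caller frame pointer */
	uint64_t ret_pc;	/* return address */
};

/* Walk the fp chain, bounded by the stack range and a frame limit. */
static uint32_t walk_frames(uint64_t fp, uint64_t stack_lo, uint64_t stack_hi,
			    uint64_t *pcs, uint32_t max_frames)
{
	uint32_t n = 0;

	while (n < max_frames && fp >= stack_lo &&
	       fp + sizeof(struct fp_frame) <= stack_hi) {
		const struct fp_frame *f =
		    (const struct fp_frame *)(uintptr_t)fp;

		if (f->ret_pc == 0)
			break;
		pcs[n++] = f->ret_pc;
		if (f->prev_fp <= fp)	/* stacks grow down: fp must increase */
			break;
		fp = f->prev_fp;
	}
	return n;
}

/* Build a fake three-frame chain in memory and walk it; returns the
 * number of frames recovered. Purely for demonstration. */
static uint32_t demo_walk(void)
{
	static struct fp_frame chain[4];
	uint64_t base = (uint64_t)(uintptr_t)&chain[0];
	uint64_t pcs[MAX_FRAMES];

	chain[0].prev_fp = base + sizeof(chain[0]);
	chain[0].ret_pc = 0x1111;
	chain[1].prev_fp = base + 2 * sizeof(chain[0]);
	chain[1].ret_pc = 0x2222;
	chain[2].prev_fp = 0;	/* end of chain */
	chain[2].ret_pc = 0x3333;
	return walk_frames(base, base, base + sizeof(chain), pcs, MAX_FRAMES);
}
```

The range and monotonicity checks are what make the walk safe to run in a signal handler on a possibly corrupted stack.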
@@ -227,9 +229,12 @@ After a fatal signal arrives, the handler runs on the altstack and mainly does the following:
 - `ip/sp/fp/lr`: top-level register snapshot;
 - `args[]`: ABI argument registers of the top frame;
 - `executable_path`: main program path;
+- `thread_name`: name of the crashing thread; this field may be empty for old-version snapshots;
 - `modules_count` + `modules[]`: module metadata;
 - `frames_count` + `frames[]`: the stack-frame array.
 
+The snapshot ABI has now been raised to **v3**. Compared with earlier versions, v3 adds the `thread_name` field, which persists the name of the crashing thread (or at least the main process/task) into the snapshot so that the Stage-2 summary at the next startup can print it directly. To keep historical `.crash` files left over in old containers from becoming entirely unconsumable due to the struct size change, the read side is compatible with old-version records and upgrades them in memory to the current structure before running the unified Stage-2 symbolization flow.
+
 ### 6.5 Capability limits of `args[]`
 
 `args[]` is only the **raw ABI argument registers of the top frame**; it does not represent:
@@ -363,9 +368,9 @@ The handler finally appends the record through the pre-opened `crash_snapshot_fd` with `write()`
 At startup, Stage-2 will:
 
 1. Open the `.crash` file;
-2. Read records in a loop, `sizeof(struct crash_snapshot_record)` at a time;
+2. Read the record header in a loop and use `version/size` to decide whether the current entry is in the new or the old format;
 3. Validate `magic/version/size`;
-4. Call `crash_symbolize_record()` for every valid record;
+4. Upgrade old-version records in memory, then uniformly call `crash_symbolize_record()`;
 5. After reading completes, `ftruncate(fd, 0)` to clear the file.
 
 If any of the following occurs:
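The header-first, version-dispatched read in steps 2-4 can be sketched on a byte buffer as below. The struct layouts, magic value, and field set here are deliberately simplified stand-ins, not the agent's real snapshot ABI; only the dispatch shape (check magic, then branch on `version`/`size`, upgrading old records in memory) follows the document.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SNAP_MAGIC 0x43525348u	/* hypothetical magic */
#define SNAP_V2    2
#define SNAP_V3    3

struct snap_header {
	uint32_t magic;
	uint16_t version;
	uint16_t arch;
	uint32_t size;
};

struct snap_v2 { struct snap_header h; uint32_t signal; };
struct snap_v3 { struct snap_header h; uint32_t signal; char thread_name[16]; };

/* Returns 1 and fills *out (upgrading v2 records in memory) on success,
 * 0 for a record the caller must discard. */
static int read_record(const uint8_t *buf, size_t len, struct snap_v3 *out)
{
	struct snap_header h;

	if (len < sizeof(h))
		return 0;
	memcpy(&h, buf, sizeof(h));
	if (h.magic != SNAP_MAGIC)
		return 0;
	if (h.version == SNAP_V3 && h.size == sizeof(struct snap_v3) &&
	    len >= h.size) {
		memcpy(out, buf, sizeof(*out));	/* current format: copy as-is */
		return 1;
	}
	if (h.version == SNAP_V2 && h.size == sizeof(struct snap_v2) &&
	    len >= h.size) {
		struct snap_v2 old;

		memcpy(&old, buf, sizeof(old));
		memset(out, 0, sizeof(*out));	/* new fields default to empty */
		out->h = old.h;
		out->h.version = SNAP_V3;
		out->h.size = sizeof(*out);
		out->signal = old.signal;
		return 1;
	}
	return 0;	/* unknown version: discard */
}
```

Because both branches validate `size` against the expected struct, a truncated or foreign record falls through to the discard path instead of being misparsed.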
@@ -376,18 +381,21 @@ The handler finally appends the record through the pre-opened `crash_snapshot_fd` with `write()`
 
 ### 8.2 crash summary
 
-Before symbolizing frame by frame, Stage-2 first prints a crash summary containing:
+Before symbolizing frame by frame, Stage-2 first prints a crash summary. Besides the basic crash metadata, the summary now also emphasizes "which task crashed" and "whether the executable currently on disk is still the same image". The summary therefore contains:
 
+- `task`: taken preferentially from the snapshot's `thread_name`; when consuming an old-version snapshot, it falls back to the basename of `executable_path`;
 - `signal`
 - `si_code`
 - `pid`
 - `tid`
 - `executable`
+- `executable_md5`: a best-effort MD5 that Stage-2 computes over the file `executable_path` currently points to, to quickly tell whether the on-disk image faced by this recovery still matches the executable path recorded at crash time;
 - `ip`
 - `fault_addr`
 - `frames`
+- `args[]`: printed right after the summary, the raw ABI argument register values of the top frame. On x86_64 these map to `rdi/rsi/rdx/rcx/r8/r9`, on aarch64 to `x0-x7`.
 
-This way, even if none of the later frames can be fully recovered, at least one readable crash summary is available.
+This way, even if none of the later frames can be fully recovered, at least one readable crash summary is available; and `task + executable_md5` further reduces the troubleshooting cost of "knowing only which pid/tid crashed, but not which thread it was or whether the current image has been replaced".
 
 ### 8.3 Recovery order for single-frame symbolization
 
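The fallback rule for the `task` field can be sketched as a small helper. The function name is hypothetical; only the rule itself (prefer `thread_name`, else the basename of `executable_path`) comes from the document.

```c
#include <assert.h>
#include <string.h>

/* Prefer the snapshot's cached thread name; for old snapshots with an
 * empty name, fall back to the basename of executable_path. */
static const char *summary_task_name(const char *thread_name,
				     const char *executable_path)
{
	const char *slash;

	if (thread_name != NULL && thread_name[0] != '\0')
		return thread_name;
	if (executable_path == NULL || executable_path[0] == '\0')
		return "?";
	slash = strrchr(executable_path, '/');
	return slash != NULL ? slash + 1 : executable_path;
}
```

The `"?"` placeholder for a fully empty record is an assumption here; the real Stage-2 may render that case differently.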
@@ -463,20 +471,25 @@ Stage-2's actual behavior is:
 ### 8.7 Current output illustration
 
-Possible log shapes include:
+When Stage-2 emits a recovered crash report, it first prints a prominent separator line, then the summary and the per-frame logs, and finally the same separator line again, so the whole crash report is easy to locate inside ordinary agent logs.
 
-```text
-Recovered crash snapshot: signal=11 code=1 pid=123 tid=456 executable=/usr/bin/deepflow-agent ip=0x7f... fault_addr=0x0 frames=6
-```
+Possible log shapes include:
 
 ```text
+=========================================================
+Recovered crash snapshot: task=deepflow-agent signal=11 code=1 pid=123 tid=456 executable=/usr/bin/deepflow-agent executable_md5=0123456789abcdef0123456789abcdef ip=0x7f... fault_addr=0x0 frames=6
+Recovered crash args: rdi=0x1 rsi=0x7f1234567000 rdx=0x0 rcx=0x2a r8=0x7f1234500000 r9=0x0
 Recovered crash frame[0]: pc=0x7f... module=/usr/bin/deepflow-agent rel=0x1234 symbol=foo+0x18 file=/root/project/foo.c:87 build_id=abcd...
-```
-
-```text
 Recovered crash frame[3]: pc=0x7f... module=/lib64/libc.so.6 rel=0x2a1f0
+=========================================================
 ```
 
+Three points worth noting here:
+
+1. The `task` field preferentially comes from the crashing thread's cached name; if an old-version snapshot is consumed, or the name is unavailable, it falls back to the executable's file name.
+2. `executable_md5` is a best-effort digest that **Stage-2 computes at recovery time** over the file `executable_path` currently points to; it is not a field that Stage-1 persisted into the snapshot at crash time. It is therefore better read as "the image fingerprint of the current recovery environment" for comparison, rather than as extra on-disk ABI data from the crash moment.
+3. `args[]` prints the **raw register argument values** of the crashing thread's top frame, not a source-level parameter list reconstructed from debug info. That is, it cannot cover stack-passed arguments, floating-point arguments, or arguments optimized away, and the arguments of older frames are outside what this output guarantees.
+
 ---
 
 ## 9. ELF / DWARF helper capabilities
@@ -520,16 +533,18 @@ Recovered crash frame[3]: pc=0x7f... module=/lib64/libc.so.6 rel=0x2a1f0
 
 Current thread coverage relies on the monitored helper:
 
-- Uniformly call `crash_monitor_prepare_thread()` before the thread actually enters its work function;
+- Before the thread actually enters its work function, first write the caller-supplied thread name into the crash monitor's thread-local cache;
+- Then uniformly call `crash_monitor_prepare_thread()`;
 - Then enter the original worker routine.
 
 Benefits of this design:
 
 - No thread entry point needs to hand-write altstack initialization again;
 - Onboarding and auditing are more uniform;
-- When adding new C/eBPF workers, simply reusing the existing monitored helper automatically brings them under crash monitor protection.
+- When adding new C/eBPF workers, simply reusing the existing monitored helper automatically brings them under crash monitor protection;
+- At crash time there is no need to touch `/proc/thread-self/comm` from the signal handler; the thread name was already prepared in a normal context.
 
-Note that if a future new thread bypasses the monitored helper, then even though the process has a fatal handler installed, it may still fail to capture a reliable snapshot because it lacks an altstack.
+Note that if a future new thread bypasses the monitored helper, then even though the process has a fatal handler installed, it may still fail to capture a reliable snapshot because it lacks an altstack; its thread name will also never enter the crash monitor's thread-local cache automatically.
 
 ---

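The monitored-helper flow described above can be sketched as a wrapper around the worker routine. All `crash_*` functions here are simplified stand-ins for illustration; in the real agent this wrapper would be the pthread start routine, and the preparation step would install the altstack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static __thread char crash_thread_name[16];
static __thread int crash_thread_prepared;

/* Stand-in: cache the expected thread name in thread-local storage. */
static void crash_monitor_set_thread_name(const char *name)
{
	crash_thread_name[0] = '\0';
	if (name == NULL || name[0] == '\0')
		return;
	strncpy(crash_thread_name, name, sizeof(crash_thread_name) - 1);
	crash_thread_name[sizeof(crash_thread_name) - 1] = '\0';
}

/* Stand-in: real code would install the altstack here. */
static void crash_monitor_prepare_thread(void)
{
	crash_thread_prepared = 1;
}

struct monitored_arg {
	const char *name;		/* expected thread name */
	void *(*routine)(void *);	/* real worker */
	void *arg;
};

static void *monitored_thread_entry(void *p)
{
	struct monitored_arg *m = p;

	crash_monitor_set_thread_name(m->name);	/* 1. cache the name */
	crash_monitor_prepare_thread();		/* 2. altstack etc. */
	return m->routine(m->arg);		/* 3. real work */
}

/* Worker used by the self-check: reports what the wrapper set up. */
static void *check_worker(void *arg)
{
	(void)arg;
	return (crash_thread_prepared == 1 &&
		strcmp(crash_thread_name, "sk-reader") == 0) ? (void *)1
							     : NULL;
}
```

Because both the name and the altstack are set up in the same wrapper, a thread that goes through the helper can never end up half-covered.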
@@ -620,10 +635,13 @@ After Stage-2 finishes consuming, it calls `ftruncate()` to clear the snapshot file. Therefore, if
 - top-level register capture;
 - bounded frame-pointer backtrace;
 - fixed-size crash snapshot ABI;
+- compatible consumption of v2/v3 snapshots;
 - binary snapshot written to a fixed path;
 - `/proc/self/maps` module cache;
-- `modules[] / executable_path / module_index / rel_pc` written;
+- `modules[] / executable_path / thread_name / module_index / rel_pc` written;
 - consumption of old snapshots at startup;
+- task name, executable path, and MD5 output in the Stage-2 crash summary;
+- separator lines before and after the Stage-2 recovery log;
 - Stage-2 ELF symbol resolution;
 - Stage-2 DWARF `file:line` resolution;
 - build-id aware external debuginfo lookup;

agent/src/ebpf/user/crash_monitor.c

Lines changed: 185 additions & 10 deletions
@@ -70,6 +70,36 @@
  */
 
 #define CRASH_ALTSTACK_SIZE (64 * 1024)
+#define CRASH_SNAPSHOT_VERSION_V2 2
+
+struct crash_snapshot_record_header {
+	uint32_t magic;
+	uint16_t version;
+	uint16_t arch;
+	uint32_t size;
+};
+
+struct crash_snapshot_record_v2 {
+	uint32_t magic;
+	uint16_t version;
+	uint16_t arch;
+	uint32_t size;
+	uint32_t signal;
+	int32_t si_code;
+	uint32_t pid;
+	uint32_t tid;
+	uint64_t fault_addr;
+	uint64_t ip;
+	uint64_t sp;
+	uint64_t fp;
+	uint64_t lr;
+	uint64_t args[CRASH_SNAPSHOT_ARG_REGS];
+	char executable_path[CRASH_SNAPSHOT_MODULE_PATH_LEN];
+	uint32_t modules_count;
+	uint32_t frames_count;
+	struct crash_snapshot_module modules[CRASH_SNAPSHOT_MAX_MODULES];
+	struct crash_snapshot_frame frames[CRASH_SNAPSHOT_MAX_FRAMES];
+};
 
 /*
  * Process-wide and thread-local state used by the crash monitor.
@@ -118,6 +148,8 @@ static struct crash_snapshot_module crash_cached_modules[
 	CRASH_SNAPSHOT_MAX_MODULES];
 static uint32_t crash_cached_modules_count;
 static char crash_cached_executable_path[CRASH_SNAPSHOT_MODULE_PATH_LEN];
+static char crash_cached_process_name[CRASH_SNAPSHOT_TASK_NAME_LEN];
+static __thread char crash_thread_name[CRASH_SNAPSHOT_TASK_NAME_LEN];
 
 /*
  * Fatal signals considered interesting enough to capture. These all represent
@@ -242,6 +274,47 @@ static int crash_cache_executable_path(void)
 	return ETR_OK;
 }
 
+static void crash_cache_process_name(void)
+{
+	ssize_t nread;
+	int fd;
+
+	crash_cached_process_name[0] = '\0';
+	fd = open("/proc/self/comm", O_RDONLY | O_CLOEXEC);
+	if (fd < 0)
+		return;
+	nread = read(fd, crash_cached_process_name,
+		     sizeof(crash_cached_process_name) - 1);
+	close(fd);
+	if (nread <= 0) {
+		crash_cached_process_name[0] = '\0';
+		return;
+	}
+	crash_cached_process_name[nread] = '\0';
+	crash_trim_trailing_newline(crash_cached_process_name);
+}
+
+static void crash_copy_thread_name(char *dst, size_t dst_size)
+{
+	if (dst == NULL || dst_size == 0)
+		return;
+	dst[0] = '\0';
+	if (crash_thread_name[0] != '\0') {
+		crash_copy_cstr(dst, dst_size, crash_thread_name);
+		return;
+	}
+	crash_copy_cstr(dst, dst_size, crash_cached_process_name);
+}
+
+void crash_monitor_set_thread_name(const char *name)
+{
+	if (name == NULL || name[0] == '\0') {
+		crash_thread_name[0] = '\0';
+		return;
+	}
+	crash_copy_cstr(crash_thread_name, sizeof(crash_thread_name), name);
+}
+
 static void crash_fill_module_build_id(struct crash_snapshot_module *module)
 {
 	uint32_t build_id_size = 0;
@@ -290,6 +363,7 @@ static int crash_cache_modules(void)
 	crash_cached_modules_count = 0;
 	memset(crash_cached_modules, 0, sizeof(crash_cached_modules));
 	(void)crash_cache_executable_path();
+	crash_cache_process_name();
 
 	maps = fopen("/proc/self/maps", "r");
 	if (maps == NULL)
@@ -354,6 +428,7 @@ static void crash_copy_cached_modules_to_record(struct crash_snapshot_record *re
 	record->modules_count = crash_cached_modules_count;
 	crash_copy_cstr(record->executable_path, sizeof(record->executable_path),
 			crash_cached_executable_path);
+	crash_copy_thread_name(record->thread_name, sizeof(record->thread_name));
 	for (i = 0; i < crash_cached_modules_count; i++)
 		crash_copy_module(&record->modules[i], &crash_cached_modules[i]);
 }
@@ -793,6 +868,105 @@ static int crash_install_signal_handlers(void)
 	return ETR_OK;
 }
 
+static void crash_upgrade_v2_record(struct crash_snapshot_record *dst,
+				    const struct crash_snapshot_record_v2 *src)
+{
+	if (dst == NULL || src == NULL)
+		return;
+
+	memset(dst, 0, sizeof(*dst));
+	dst->magic = src->magic;
+	dst->version = CRASH_SNAPSHOT_VERSION;
+	dst->arch = src->arch;
+	dst->size = sizeof(*dst);
+	dst->signal = src->signal;
+	dst->si_code = src->si_code;
+	dst->pid = src->pid;
+	dst->tid = src->tid;
+	dst->fault_addr = src->fault_addr;
+	dst->ip = src->ip;
+	dst->sp = src->sp;
+	dst->fp = src->fp;
+	dst->lr = src->lr;
+	crash_copy_bytes(dst->args, sizeof(dst->args), src->args,
+			 sizeof(src->args));
+	crash_copy_cstr(dst->executable_path, sizeof(dst->executable_path),
+			src->executable_path);
+	dst->modules_count = src->modules_count;
+	dst->frames_count = src->frames_count;
+	crash_copy_bytes(dst->modules, sizeof(dst->modules), src->modules,
+			 sizeof(src->modules));
+	crash_copy_bytes(dst->frames, sizeof(dst->frames), src->frames,
+			 sizeof(src->frames));
+}
+
+static int crash_read_next_pending_record(int fd,
+					  struct crash_snapshot_record *record,
+					  ssize_t *nread_out)
+{
+	struct crash_snapshot_record_header header;
+	ssize_t nread;
+	ssize_t remain;
+
+	if (record == NULL || nread_out == NULL)
+		return ETR_INVAL;
+	*nread_out = 0;
+
+	nread = read(fd, &header, sizeof(header));
+	if (nread <= 0) {
+		*nread_out = nread;
+		return (nread == 0) ? ETR_OK : ETR_INVAL;
+	}
+	if (nread != sizeof(header)) {
+		*nread_out = nread;
+		return ETR_INVAL;
+	}
+
+	if (header.magic != CRASH_SNAPSHOT_MAGIC) {
+		*nread_out = sizeof(header);
+		return ETR_NOTEXIST;
+	}
+
+	if (header.version == CRASH_SNAPSHOT_VERSION &&
+	    header.size == sizeof(*record)) {
+		struct crash_snapshot_record *on_disk = record;
+
+		memset(on_disk, 0, sizeof(*on_disk));
+		on_disk->magic = header.magic;
+		on_disk->version = header.version;
+		on_disk->arch = header.arch;
+		on_disk->size = header.size;
+		remain = (ssize_t)sizeof(*on_disk) - (ssize_t)sizeof(header);
+		nread = read(fd, (char *)on_disk + sizeof(header),
+			     (size_t)remain);
+		*nread_out = sizeof(header) + nread;
+		if (nread != remain)
+			return ETR_INVAL;
+		return ETR_OK;
+	}
+
+	if (header.version == CRASH_SNAPSHOT_VERSION_V2 &&
+	    header.size == sizeof(struct crash_snapshot_record_v2)) {
+		struct crash_snapshot_record_v2 old_record;
+
+		memset(&old_record, 0, sizeof(old_record));
+		old_record.magic = header.magic;
+		old_record.version = header.version;
+		old_record.arch = header.arch;
+		old_record.size = header.size;
+		remain = (ssize_t)sizeof(old_record) - (ssize_t)sizeof(header);
+		nread = read(fd, (char *)&old_record + sizeof(header),
+			     (size_t)remain);
+		*nread_out = sizeof(header) + nread;
+		if (nread != remain)
+			return ETR_INVAL;
+		crash_upgrade_v2_record(record, &old_record);
+		return ETR_OK;
+	}
+
+	*nread_out = sizeof(header);
+	return ETR_NOTEXIST;
+}
+
 static void crash_log_pending_record(const struct crash_snapshot_record *record)
 {
 	if (record == NULL)
@@ -836,23 +1010,24 @@ int crash_monitor_consume_pending_snapshots(void)
 		return ETR_INVAL;
 	}
 
-	while ((nread = read(fd, &record, sizeof(record))) == sizeof(record)) {
-		if (record.magic != CRASH_SNAPSHOT_MAGIC ||
-		    record.version != CRASH_SNAPSHOT_VERSION ||
-		    record.size != sizeof(record)) {
+	for (;;) {
+		int ret = crash_read_next_pending_record(fd, &record, &nread);
+
+		if (ret == ETR_OK && nread == 0)
+			break;
+		if (ret == ETR_OK) {
+			crash_log_pending_record(&record);
+			continue;
+		}
+		if (ret == ETR_NOTEXIST) {
 			ebpf_warning("Discard invalid crash snapshot record from %s\n",
 				     path);
 			continue;
 		}
-		crash_log_pending_record(&record);
-	}
-
-	if (nread < 0) {
 		close(fd);
 		return ETR_INVAL;
 	}
-	if (nread != 0)
-		ebpf_warning("Discard truncated crash snapshot file %s\n", path);
+
 	if (ftruncate(fd, 0) != 0) {
 		close(fd);
 		return ETR_INVAL;
