Sanitizer

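Sanitizer is an open-source tool suite started by Google for detecting memory leaks and other runtime errors. It includes AddressSanitizer, MemorySanitizer, ThreadSanitizer, LeakSanitizer, and several other tools. They were first implemented in the LLVM project and were later adopted by GNU GCC: GCC has supported AddressSanitizer and ThreadSanitizer since version 4.8 and LeakSanitizer since version 4.9. MemorySanitizer is still available only in Clang.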

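  • AddressSanitizer: detects invalid memory accesses
  • MemorySanitizer: detects reads of uninitialized memory
  • ThreadSanitizer: detects data races and deadlocks between threads
  • LeakSanitizer: detects memory leaks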

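The sanitizers ship with the compiler, so nothing extra has to be installed. Compared with Valgrind they slow the program down far less; anyone who has used Valgrind knows how sharply it reduces performance.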

Official documentation: https://github.com/google/sanitizers/wiki/

AddressSanitizer

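How it works
AddressSanitizer (ASan) is a memory error detector that catches a wide range of memory-related bugs, including memory leaks. It is enabled by adding -fsanitize=address to the compile options; the same flag applies when building with the Android NDK. ASan monitors memory operations while the program runs and, when it detects an error, prints a detailed report with the kind of error, the size and address involved, and the relevant stack traces.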

The principles behind AddressSanitizer

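  • Memory layout changes: ASan pads every object (variable, array, and so on) with extra "red zones" (redzones). The compiler does this for stack and global variables, and the runtime allocator does it for heap blocks. The redzones are used to detect out-of-bounds accesses. For example, if an array access runs past the array's bounds and touches a redzone, ASan reports a buffer overflow.

  • Shadow memory: ASan uses shadow memory to track the state of the program's memory. Shadow memory is a compact map of application memory that stores metadata about it: each shadow byte describes 8 application bytes and records whether they are addressable, partially addressable, or poisoned (redzone, freed memory, and so on). Unlike MemorySanitizer, ASan does not track whether memory has been initialized. On every memory access ASan checks the corresponding shadow byte to decide whether the access is legal.

  • Compiler instrumentation: ASan has the compiler insert check code into the program. The checks run at each memory access to detect memory errors. In addition, the ASan runtime library intercepts heap allocation and deallocation functions such as malloc and free, which is how it detects memory leaks and uses of freed memory.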

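Usage

ASan is a memory error detector for C/C++. It can detect:

  • Use after free: access through a dangling pointer (loosely also called a wild pointer), that is, reading or writing memory after it has been freed.
  • Heap buffer overflow: an out-of-bounds access to heap memory.
  • Stack buffer overflow: an out-of-bounds access to stack memory.
  • Global buffer overflow: an out-of-bounds access to a global object such as a global array. All three overflows are out-of-bounds accesses; they differ only in which segment of the program the object lives in.
  • Use after return: a dangling pointer into the stack. For example, function A calls function B and passes in a pointer, B makes it point to one of B's local (stack) variables, and A dereferences the pointer after B has returned.
  • Use after scope: similar to use after return, except that the variable's lifetime ends at the closing brace of a {} block (a scope in C/C++) rather than at function return.
  • Initialization order bugs
  • Memory leaks: AddressSanitizer integrates the functionality of LeakSanitizer.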

Use after free

#include <stdlib.h>

int main(int argc, char *argv[])
{
    char *p = malloc(10);   // allocate 10 bytes on the heap
    free(p);                // release the block
    p[0] = 1;               // BUG: write to memory that was already freed
    return 0;
}

Compile

(base) sv@sv-NF5280M5:/home/sv/桌面$ g++ -g -fsanitize=address main.cpp -fpermissive
main.cpp: In function ‘int main(int, char**)’:
main.cpp:5:21: warning: invalid conversion from ‘void*’ to ‘char*’ [-fpermissive]
5 | char *p = malloc(10);
| ~~~~~~^~~~
| |
| void*

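Note: -g adds debugging information to the executable, which lets the ASan report show source files and line numbers. The warning above appears because the file is compiled as C++, where void* does not convert implicitly to char*; -fpermissive downgrades that error to a warning.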

Run the program and inspect the output

(base) sv@sv-NF5280M5:/home/sv/桌面$ ./a.out
=================================================================
==1336741==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000010 at pc 0x55bced57822a bp 0x7ffc50d251d0 sp 0x7ffc50d251c0
WRITE of size 1 at 0x602000000010 thread T0
#0 0x55bced578229 in main /home/sv/桌面/main.cpp:7
#1 0x7faf0d998082 in __libc_start_main ../csu/libc-start.c:308
#2 0x55bced57810d in _start (/home/sv/桌面/a.out+0x110d)

0x602000000010 is located 0 bytes inside of 10-byte region [0x602000000010,0x60200000001a)
freed by thread T0 here:
#0 0x7faf0dfbf40f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:122
#1 0x55bced5781f5 in main /home/sv/桌面/main.cpp:6
#2 0x7faf0d998082 in __libc_start_main ../csu/libc-start.c:308

previously allocated by thread T0 here:
#0 0x7faf0dfbf808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
#1 0x55bced5781e5 in main /home/sv/桌面/main.cpp:5
#2 0x7faf0d998082 in __libc_start_main ../csu/libc-start.c:308

SUMMARY: AddressSanitizer: heap-use-after-free /home/sv/桌面/main.cpp:7 in main
Shadow bytes around the buggy address:
0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa[fd]fd fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==1336741==ABORTING

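Clearly, there is a heap-use-after-free.

Let's go through the report in detail:

1. The process ID, the error type, whether the access was a read or a write, the address accessed, the thread ID, and the stack backtrace of the access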

==1336741==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000010 at pc 0x55bced57822a bp 0x7ffc50d251d0 sp 0x7ffc50d251c0
WRITE of size 1 at 0x602000000010 thread T0
#0 0x55bced578229 in main /home/sv/桌面/main.cpp:7
#1 0x7faf0d998082 in __libc_start_main ../csu/libc-start.c:308
#2 0x55bced57810d in _start (/home/sv/桌面/a.out+0x110d)

2. Where the access falls inside the block: byte 0 of the 10-byte region [0x602000000010,0x60200000001a), followed by the thread and call stack that freed the block.

0x602000000010 is located 0 bytes inside of 10-byte region [0x602000000010,0x60200000001a)
freed by thread T0 here:
#0 0x7faf0dfbf40f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:122
#1 0x55bced5781f5 in main /home/sv/桌面/main.cpp:6
#2 0x7faf0d998082 in __libc_start_main ../csu/libc-start.c:308

3. The thread that allocated the block and the call stack of the allocation

previously allocated by thread T0 here:
#0 0x7faf0dfbf808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
#1 0x55bced5781e5 in main /home/sv/桌面/main.cpp:5
#2 0x7faf0d998082 in __libc_start_main ../csu/libc-start.c:308

Heap buffer overflow

#include <stdlib.h>

int main(int argc, char *argv[])
{
    char *p = malloc(10);   // valid indices are 0..9
    p[10] = 1;              // BUG: writes one byte past the end of the block
    free(p);
    return 0;
}

Compile

(base) sv@sv-NF5280M5:/home/sv/桌面$ g++ -g -fsanitize=address main.cpp -fpermissive
main.cpp: In function ‘int main(int, char**)’:
main.cpp:5:21: warning: invalid conversion from ‘void*’ to ‘char*’ [-fpermissive]
5 | char *p = malloc(10);
| ~~~~~~^~~~
| |
| void*

Run the program and inspect the output

(base) sv@sv-NF5280M5:/home/sv/桌面$ ./a.out
=================================================================
==1336960==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000001a at pc 0x5563af190226 bp 0x7fff4fbb4d30 sp 0x7fff4fbb4d20
WRITE of size 1 at 0x60200000001a thread T0
#0 0x5563af190225 in main /home/sv/桌面/main.cpp:6
#1 0x7f88cfff5082 in __libc_start_main ../csu/libc-start.c:308
#2 0x5563af19010d in _start (/home/sv/桌面/a.out+0x110d)

0x60200000001a is located 0 bytes to the right of 10-byte region [0x602000000010,0x60200000001a)
allocated by thread T0 here:
#0 0x7f88d061c808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
#1 0x5563af1901e5 in main /home/sv/桌面/main.cpp:5
#2 0x7f88cfff5082 in __libc_start_main ../csu/libc-start.c:308

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/sv/桌面/main.cpp:6 in main
Shadow bytes around the buggy address:
0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa 00[02]fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==1336960==ABORTING

Stack buffer overflow

#include <stdlib.h>
int main(int argc, char *argv[])
{
    char p[10];     /* valid indices are 0..9 */
    p[10] = 1;      /* BUG: writes one byte past the end of the stack array */
    return 0;
}

Compile and run; the output is:

=================================================================
==26927==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffe0389d7a at pc 0x5577fcfd7289 bp 0x7fffe0389d30 sp 0x7fffe0389d20
WRITE of size 1 at 0x7fffe0389d7a thread T0
#0 0x5577fcfd7288 in main /home/thomas/test/ctest/main.c:5
#1 0x7f97e9d0d082 in __libc_start_main ../csu/libc-start.c:308
#2 0x5577fcfd710d in _start (/home/thomas/test/ctest/a.out+0x110d)

Address 0x7fffe0389d7a is located in stack of thread T0 at offset 42 in frame
#0 0x5577fcfd71d8 in main /home/thomas/test/ctest/main.c:3

This frame has 1 object(s):
[32, 42) 'p' (line 4) <== Memory access at offset 42 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/thomas/test/ctest/main.c:5 in main
...

Global buffer overflow

#include <stdlib.h>
char p[10]={0};     /* global array, valid indices are 0..9 */
int main(int argc, char *argv[])
{
    p[10] = 1;      /* BUG: writes one byte past the end of the global array */
    return 0;
}

Compile and run; the output is:

=================================================================
==27271==ERROR: AddressSanitizer: global-buffer-overflow on address 0x5632b22930aa at pc 0x5632b2290213 bp 0x7ffeba0b5140 sp 0x7ffeba0b5130
WRITE of size 1 at 0x5632b22930aa thread T0
#0 0x5632b2290212 in main /home/thomas/test/ctest/main.c:5
#1 0x7f90fd627082 in __libc_start_main ../csu/libc-start.c:308
#2 0x5632b229010d in _start (/home/thomas/test/ctest/a.out+0x110d)

0x5632b22930aa is located 0 bytes to the right of global variable 'p' defined in 'main.c:2:6' (0x5632b22930a0) of size 10
SUMMARY: AddressSanitizer: global-buffer-overflow /home/thomas/test/ctest/main.c:5 in main
...

...

LeakSanitizer

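LeakSanitizer (LSan) is a memory leak detector for C/C++ programs. It tracks dynamic memory allocation and deallocation while the program runs and reports blocks that were never freed, which helps developers locate and fix leaks quickly. LSan originated in Clang/LLVM and is also available in GCC (since 4.9). It complements Valgrind, which is a standalone tool and not part of any compiler.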

Example

#include <stdlib.h>

void foo() {
    int* ptr = (int *)malloc(sizeof(int)); // allocate memory
    // ptr is never freed
}

int main() {
    foo();
    return 0;
}

Compile

(base) sv@sv-NF5280M5:/home/sv/桌面$ clang -g -fsanitize=leak main.cpp -fpermissive

Run the program and inspect the output [Note: compiling with g++ may fail.]

(base) sv@sv-NF5280M5:/home/sv/桌面$ ./a.out

=================================================================
==1338908==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4 byte(s) in 1 object(s) allocated from:
#0 0x407925 in malloc (/home/sv/桌面/a.out+0x407925)
#1 0x426dc1 in foo() /home/sv/桌面/main.cpp:4:23
#2 0x426de3 in main /home/sv/桌面/main.cpp:9:5
#3 0x7f0251d2e082 in __libc_start_main /build/glibc-e2p3jK/glibc-2.31/csu/../csu/libc-start.c:308:16

SUMMARY: LeakSanitizer: 4 byte(s) leaked in 1 allocation(s).

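Notes

  1. LeakSanitizer only tracks dynamically allocated (heap) memory; it cannot diagnose problems with statically allocated or global storage.
  2. Enabling LeakSanitizer can affect program performance to some degree, so it is normally used during development and testing rather than left on in production.
  3. Using LeakSanitizer can run into problems such as initialization failures or missing runtime libraries. These have to be diagnosed from the specific error message.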

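Differences between AddressSanitizer and LeakSanitizer

AddressSanitizer (ASan) and LeakSanitizer (LSan) are both memory error detectors. They differ mainly in the kind of problem they detect and in where they are applied.

  1. AddressSanitizer (ASan):
    • ASan detects memory errors such as out-of-bounds accesses, use of freed memory, and heap and stack buffer overflows.
    • ASan inserts extra run-time checks at compile time, analyzes the program dynamically as it runs, and reports detailed information including the location of the error.
    • ASan is mainly used to find and debug memory-related problems, helping developers catch and fix memory errors early.
  2. LeakSanitizer (LSan):
    • LSan detects memory leaks, that is, dynamically allocated memory that is never released.
    • LSan tracks allocation and deallocation operations, detects memory that was not freed, and reports where each leaked block was allocated and what kind of leak it is.
    • LSan is mainly used to find memory leaks, helping developers track down unreleased memory and use memory more efficiently.

ASan mainly detects memory errors such as out-of-bounds accesses and use after free, while LSan mainly detects memory leaks. Both are enabled with a compiler flag and do their work at run time, helping developers find and fix memory-related problems.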

References: