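Overview

Kernel Address SANitizer (KASAN) is a dynamic memory-error detector for the Linux kernel. It catches common kernel-mode memory access bugs such as out-of-bounds accesses, use-after-free, and double-free.

KASAN uses extra memory to track the state of usable memory. This extra memory is called shadow memory, and KASAN sets aside 1/8 of memory for it. The shadow memory is filled with special magic numbers, and on every memory load/store the corresponding shadow memory is checked to decide whether the access is valid.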
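The shadow check described above can be modeled in a few lines. The following is a minimal user-space sketch in Python, not kernel code. It assumes the generic-mode encoding in which one shadow byte describes an 8-byte granule: 0 means all 8 bytes are accessible, 1 to 7 means only the first N bytes are accessible, and values with the top bit set mark inaccessible memory such as redzones. The names mark_allocated, byte_is_poisoned and access_is_valid, and the plain list used as shadow memory, are illustrative only.

```python
GRANULE = 8      # bytes of real memory described by one shadow byte
REDZONE = 0xFC   # example marker for inaccessible memory


def mark_allocated(shadow, base, size):
    """Mark `size` bytes starting at the granule-aligned offset `base` as accessible."""
    first = base // GRANULE
    full, rest = divmod(size, GRANULE)
    for i in range(full):
        shadow[first + i] = 0x00          # whole granule is accessible
    if rest:
        shadow[first + full] = rest       # only the first `rest` bytes are accessible


def byte_is_poisoned(shadow, addr):
    """Return True if the single byte at `addr` must not be accessed."""
    value = shadow[addr // GRANULE]
    if value == 0x00:
        return False
    if value >= 0x80:                     # negative when read as a signed byte
        return True
    return addr % GRANULE >= value        # partially accessible granule


def access_is_valid(shadow, addr, size):
    """Check every byte of an access of `size` bytes starting at `addr`."""
    return not any(byte_is_poisoned(shadow, a) for a in range(addr, addr + size))


# A toy 256-byte region: everything starts out as redzone,
# then a 123-byte object is "allocated" at offset 0.
shadow = [REDZONE] * (256 // GRANULE)
mark_allocated(shadow, 0, 123)

print(access_is_valid(shadow, 122, 1))    # last byte of the object
print(access_is_valid(shadow, 123, 1))    # first byte past the object
```

The first check passes and the second fails: the last granule of the object gets shadow value 3, and offset 123 falls on the fourth byte of that granule.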

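For a detailed analysis of how KASAN detection works, see: 一文搞懂Linux内核内存管理中的KASAN实现原理 - 知乎 (zhihu.com)

KASAN is a kernel debugging tool and is disabled by default. Enabling it increases the image size by more than 25%, increases memory consumption, and noticeably reduces performance.

The following uses the RK3568 3.2 Release version as an example to show how to enable KASAN detection: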

1. Modify the kernel configuration to enable KASAN

Code path: ./kernel/linux/config/linux-5.10/arch/arm64/configs/rk3568_standard_defconfig

Add the KASAN options:

CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
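
Before starting a long build, it can be worth confirming that both options really are present in the defconfig. The following is a minimal Python sketch, assuming the options appear as plain NAME=y lines. The function name missing_options is made up for this example, and the sample text stands in for the real file contents.

```python
REQUIRED = ("CONFIG_KASAN=y", "CONFIG_KASAN_GENERIC=y")


def missing_options(defconfig_text, required=REQUIRED):
    """Return the required options that do not appear as a line of their own."""
    present = {line.strip() for line in defconfig_text.splitlines()}
    return [option for option in required if option not in present]


# Sample text standing in for the contents of the defconfig file.
sample = "CONFIG_ARM64=y\nCONFIG_KASAN=y\n# CONFIG_KASAN_GENERIC is not set\n"
print(missing_options(sample))
```

An empty list means both options are set. For the real file, read it with open(path).read() and pass the text in.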

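2. Configure multi-shot mode:

By default KASAN runs in one-shot mode and prints an error log only for the first fault it detects. Enable multi-shot mode so that a log is printed for every fault.

Code path: ./kernel/linux/linux-5.10/arch/arm64/mm/kasan_init.c

Add the multi-shot call: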

void __init kasan_init(void)
{
......
        /* enable multi-shot mode so that every fault is reported */
        kasan_save_enable_multi_shot();
}

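3. Add fault-injection test cases:

The Linux kernel source already includes test code for KASAN at linux/lib/test_kasan.c. You can run it through the KUnit test framework to test the build, or adapt the test code and run it directly.

3.1 Method 1: build the fault-injection tests into the kernel so that they run at boot:

In the ./kernel/linux/config/linux-5.10/arch/arm64/mm directory, create a new file kasan_test.c and copy the fault test cases from test_kasan.c into it. Add a kmalloc_tests_init() function that calls the test cases in order, and add module_init(kmalloc_tests_init); so that the init function is executed.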

static int __init kmalloc_tests_init(void){
        bool multishot = kasan_save_enable_multi_shot();
        kmalloc_oob_right();
        kmalloc_oob_left();
        kmalloc_node_oob_right();
#ifdef CONFIG_SLUB
        kmalloc_pagealloc_oob_right();
#endif
        kmalloc_large_oob_right();
        kmalloc_oob_krealloc_more();
        kmalloc_oob_krealloc_less();
        kmalloc_oob_16();
        kmalloc_oob_in_memset();
        kmalloc_oob_memset_2();
        kmalloc_oob_memset_4();
        kmalloc_oob_memset_8();
        kmalloc_oob_memset_16();
        kmalloc_uaf();
        kmalloc_uaf_memset();
        kmalloc_uaf2();
        kmalloc_double_free_test();
        kasan_restore_multi_shot(multishot);
        return -EAGAIN;
}
module_init(kmalloc_tests_init);
MODULE_LICENSE("GPL");

kasan_test.c download link: 稳定性测试工具/tools/parseKmsgForKasan · 拉瓦尔空间/Laval_tools - 码云 - 开源中国 (gitee.com)

Also modify the Makefile in the same directory so that kasan_test is built:

obj-$(CONFIG_KASAN) += kasan_init.o kasan_test.o

4. Build the KASAN image

./build.sh --product-name rk3568

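5. Collecting fault logs

With method 3.1 the test cases run automatically at boot. At that point hilog has not finished initializing, so the logs cannot be written to disk and the fault logs can only be captured over the serial port.

With method 3.2 or the KUnit test framework, run the tests manually and then collect the kmsg logs.

5.1 Collecting kernel logs with kmsg

After flashing the KASAN image, use the following commands to enable writing kmsg logs to disk: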

hdc shell
hilog -w start -t kmsg -n 10 -l 100M 
reboot 

-n 10: keep up to 10 log files

-l 100M: each log file is 100 MB

After testing, export the kmsg logs to the PC:

hdc shell 
cd data/log/hilog
tar -zcvf kmsg.tgz hilog_kmsg*
exit
hdc file recv data/log/hilog/kmsg.tgz ./kmsg.tgz

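5.2 Checking whether KASAN detected a fault in the logs:

A KASAN fault log starts with the keyword BUG: KASAN: and ends with the keyword ======================================. You can search the log file for these keywords manually, or use the script parsekmsgforkasan.py to search automatically. Script download link: stability_testing_tools/tools/parseKmsgForKasan - Laval_tools - 拉瓦尔空间 (gitee.com)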
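
The keyword search described above is easy to script. The sketch below is not the parsekmsgforkasan.py script referenced above, only a minimal Python illustration of the same idea. The function name extract_kasan_reports is hypothetical, and the code assumes that every report starts at a line containing BUG: KASAN: and ends at the next line containing a run of = characters.

```python
START = "BUG: KASAN:"
END = "=========="   # a run of '=' characters closes a report


def extract_kasan_reports(lines):
    """Return a list of reports, each a list of lines starting at the headline."""
    reports, current = [], None
    for raw in lines:
        line = raw.rstrip("\n")
        if START in line:
            if current is not None:       # previous report had no closing rule
                reports.append(current)
            current = [line]
        elif current is not None:
            if END in line:
                reports.append(current)
                current = None
            else:
                current.append(line)
    if current is not None:               # log ended in the middle of a report
        reports.append(current)
    return reports


sample = [
    "[    4.725817] BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0x5c/0x94",
    "[    4.725859] Write of size 1 at addr ffffff8007ba957b by task swapper/0/1",
    "[    4.727904] ==================================================================",
    "[    4.730000] some unrelated kernel message",
]
for report in extract_kasan_reports(sample):
    print(report[0])
```

To run it on an exported log, open the file and pass the file object (or a list of its lines) as lines.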

6. Analyzing fault logs

6.1 Out-of-bounds access (out-of-bounds)

Fault log:

[    4.725817] BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0x5c/0x94
[    4.725859] Write of size 1 at addr ffffff8007ba957b by task swapper/0/1
[    4.725888] 
[    4.725931] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.97 #1
[    4.725965] Hardware name: rockchip,rk3568-toybrick-dev-linux-x0 (DT)
[    4.725998] Call trace:
[    4.726042]  dump_backtrace+0x0/0x2b0
[    4.726080]  show_stack+0x24/0x30
[    4.726124]  dump_stack+0x100/0x170
[    4.726172]  print_address_description+0x7c/0x510
[    4.726214]  kasan_report+0x164/0x1ac
[    4.726257]  __asan_store1+0x8c/0x90
[    4.726298]  kmalloc_oob_right+0x5c/0x94
[    4.726336]  kmalloc_tests_init+0x28/0x80
[    4.726378]  do_one_initcall+0x194/0x388
[    4.726421]  do_initcall_level+0x194/0x1c0
[    4.726461]  do_initcalls+0x5c/0xa0
[    4.726502]  do_basic_setup+0x74/0x88
[    4.726543]  kernel_init_freeable+0x180/0x1d0
[    4.726584]  kernel_init+0x20/0x128
[    4.726625]  ret_from_fork+0x10/0x30
[    4.726652] 
[    4.726684] Allocated by task 1:
[    4.726727]  kasan_save_stack+0x38/0x68
[    4.726768]  __kasan_kmalloc+0xd4/0xfc
[    4.726808]  kasan_kmalloc+0x10/0x1c
[    4.726848]  kmem_cache_alloc_trace+0x1fc/0x2e8
[    4.726887]  kmalloc_oob_right+0x4c/0x94
[    4.726925]  kmalloc_tests_init+0x28/0x80
[    4.726964]  do_one_initcall+0x194/0x388
[    4.727005]  do_initcall_level+0x194/0x1c0
[    4.727045]  do_initcalls+0x5c/0xa0
[    4.727085]  do_basic_setup+0x74/0x88
[    4.727125]  kernel_init_freeable+0x180/0x1d0
[    4.727165]  kernel_init+0x20/0x128
[    4.727204]  ret_from_fork+0x10/0x30
[    4.727230] 
[    4.727271] The buggy address belongs to the object at ffffff8007ba9500
[    4.727271]  which belongs to the cache kmalloc-128 of size 128
[    4.727316] The buggy address is located 123 bytes inside of
[    4.727316]  128-byte region [ffffff8007ba9500, ffffff8007ba9580)
[    4.727351] The buggy address belongs to the page:
[    4.727398] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7ba9
[    4.727437] flags: 0x200(slab)
[    4.727491] raw: 0000000000000200 dead000000000100 dead000000000122 ffffff8000203c80
[    4.727540] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[    4.727574] page dumped because: kasan: bad access detected
[    4.727603] 
[    4.727632] Memory state around the buggy address:
[    4.727675]  ffffff8007ba9400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[    4.727716]  ffffff8007ba9480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    4.727757] >ffffff8007ba9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03
[    4.727790]                                                                 ^
[    4.727830]  ffffff8007ba9580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    4.727869]  ffffff8007ba9600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    4.727904] ==================================================================

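1. Reading the fault log:

1) "BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0x5c/0x94" means that an out-of-bounds memory access occurred at offset 0x5c (in the disassembly) of the function kmalloc_oob_right.

2) "Write of size 1 at addr ffffff8007ba957b" means that the fault is an out-of-bounds write of 1 byte, and the out-of-bounds address is ffffff8007ba957b.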
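
The two lines discussed above carry all the fields needed for triage. As a sketch, the following Python snippet pulls them out with regular expressions. The function name summarize and the field names are illustrative, and the patterns only assume the wording visible in the log above ("BUG: KASAN: ... in func+0xoff/0xlen" and "Read/Write of size N at addr ...").

```python
import re

HEADLINE = re.compile(
    r"BUG: KASAN: (?P<kind>.+?) in "
    r"(?P<func>[A-Za-z_][\w.]*)\+(?P<offset>0x[0-9a-f]+)/(?P<length>0x[0-9a-f]+)")
ACCESS = re.compile(
    r"(?P<op>Read|Write) of size (?P<size>\d+) at addr (?P<addr>[0-9a-f]+)")


def summarize(report_lines):
    """Collect bug type, function, offset and access details from one report."""
    summary = {}
    for line in report_lines:
        m = HEADLINE.search(line)
        if m and "kind" not in summary:
            summary["kind"] = m.group("kind")
            summary["func"] = m.group("func")
            summary["offset"] = int(m.group("offset"), 16)
            summary["length"] = int(m.group("length"), 16)
        m = ACCESS.search(line)
        if m and "op" not in summary:
            summary["op"] = m.group("op")
            summary["size"] = int(m.group("size"))
            summary["addr"] = int(m.group("addr"), 16)
    return summary


report = [
    "[    4.725817] BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0x5c/0x94",
    "[    4.725859] Write of size 1 at addr ffffff8007ba957b by task swapper/0/1",
]
print(summarize(report))
```

Reports without an access line, such as double-free reports, simply produce a summary without the op, size and addr keys.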

2. Locating the faulty code with gdb:

1) Copy the vmlinux file (./out/kernel/OBJ/linux-5.10/vmlinux) to prebuilts/gcc/linux-x86/aarch64/gcc-linaro-7.5.0-2019.12-x86_64-aarch64-linux-gnu/bin.

2) cd into prebuilts/gcc/linux-x86/aarch64/gcc-linaro-7.5.0-2019.12-x86_64-aarch64-linux-gnu/bin and run ./aarch64-linux-gnu-gdb vmlinux

3) Enter list *kmalloc_oob_right+0x5c. The output shows that the fault is at line 25, ptr[size] = 'x':

0xffffffd0125eb758 is in kmalloc_oob_right(../../src_tmp/linux-5.10/arch/arm64/mm/kasan_test.c:25).
20        if (!ptr) {
21                pr_err("Allocation failed\n");
22                return;
23        }
24
25        ptr[size] = 'x';
26        kfree(ptr);
27 }
28 
29 static noinline void __init kmalloc_oob_left(void)
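
When a log contains many reports, typing the list command for each one gets tedious. As a sketch (the helper names gdb_list_command and gdb_argv are hypothetical), the following Python snippet turns a log line containing a function+0xoffset/0xlength frame into the corresponding gdb list command, and builds an argument list that would run it non-interactively using gdb's -batch and -ex options. It only builds the command and does not start gdb.

```python
import re

FRAME = re.compile(r"([A-Za-z_][\w.]*)\+(0x[0-9a-f]+)/0x[0-9a-f]+")


def gdb_list_command(log_line):
    """Return 'list *func+0xoff' for the first frame in the line, or None."""
    m = FRAME.search(log_line)
    if m is None:
        return None
    return f"list *{m.group(1)}+{m.group(2)}"


def gdb_argv(gdb, vmlinux, log_line):
    """Build an argument list that runs the list command and exits."""
    command = gdb_list_command(log_line)
    if command is None:
        raise ValueError("no function+offset frame found in line")
    return [gdb, "-batch", "-ex", command, vmlinux]


line = "[    4.725817] BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0x5c/0x94"
print(gdb_argv("./aarch64-linux-gnu-gdb", "vmlinux", line))
```

The resulting list can be passed to subprocess.run from the directory that holds gdb and vmlinux.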

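3. Analyze the code around the faulty function, determine the root cause, and fix it.

Line 16 defines size = 123, so ptr[size] = 'x'; on line 25 writes one byte past the end of the allocated buffer, which is an out-of-bounds access.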
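
The "Memory state around the buggy address" rows in the log above are consistent with this. Assuming the generic KASAN layout, where each shadow byte covers 8 bytes of memory and each row of 16 shadow bytes therefore covers 128 bytes, the Python sketch below (helper names are illustrative) parses the row marked with > and finds the shadow byte for the faulting address.

```python
import re

GRANULE = 8  # bytes of memory per shadow byte (generic KASAN)
ROW = re.compile(r"([0-9a-f]{16}):((?: [0-9a-f]{2})+)")


def parse_shadow_row(line):
    """Return (row base address, list of shadow byte values) for one log row."""
    m = ROW.search(line)
    if m is None:
        raise ValueError("not a shadow memory row")
    return int(m.group(1), 16), [int(b, 16) for b in m.group(2).split()]


def shadow_for(line, addr):
    """Return (column, shadow value, offset inside the granule) for `addr`."""
    base, values = parse_shadow_row(line)
    offset = addr - base
    if not 0 <= offset < len(values) * GRANULE:
        raise ValueError("address is not covered by this row")
    return offset // GRANULE, values[offset // GRANULE], offset % GRANULE


row = ("[    4.727757] >ffffff8007ba9500: "
       "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03")
print(shadow_for(row, 0xffffff8007ba957b))
```

The faulting address lies 0x7b = 123 bytes into the row, which maps to the last shadow byte, 03. Under the assumed encoding that value means only the first 3 bytes of that 8-byte granule (object offsets 120 to 122) are valid, and offset 123 is the fourth byte, so the write is reported.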

6.2 Accessing freed memory (use-after-free)

Fault log:

[    4.753359] ==================================================================
[    4.753404] BUG: KASAN: use-after-free in kmalloc_uaf+0x60/0x90
[    4.753443] Write of size 1 at addr ffffff8007ba9f08 by task swapper/0/1
[    4.753471] 
[    4.753512] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.10.97 #1
[    4.753545] Hardware name: rockchip,rk3568-toybrick-dev-linux-x0 (DT)
[    4.753576] Call trace:
[    4.753615]  dump_backtrace+0x0/0x2b0
[    4.753654]  show_stack+0x24/0x30
[    4.753694]  dump_stack+0x100/0x170
[    4.753737]  print_address_description+0x7c/0x510
[    4.753779]  kasan_report+0x164/0x1ac
[    4.753821]  __asan_store1+0x8c/0x90
[    4.753860]  kmalloc_uaf+0x60/0x90
[    4.753899]  kmalloc_tests_init+0x58/0x80
[    4.753940]  do_one_initcall+0x194/0x388
[    4.753982]  do_initcall_level+0x194/0x1c0
[    4.754022]  do_initcalls+0x5c/0xa0
[    4.754063]  do_basic_setup+0x74/0x88
[    4.754105]  kernel_init_freeable+0x180/0x1d0
[    4.754145]  kernel_init+0x20/0x128
[    4.754186]  ret_from_fork+0x10/0x30
[    4.754213] 
[    4.754243] Allocated by task 1:
[    4.754286]  kasan_save_stack+0x38/0x68
[    4.754327]  __kasan_kmalloc+0xd4/0xfc
[    4.754368]  kasan_kmalloc+0x10/0x1c
[    4.754407]  kmem_cache_alloc_trace+0x1fc/0x2e8
[    4.754445]  kmalloc_uaf+0x4c/0x90
[    4.754483]  kmalloc_tests_init+0x58/0x80
[    4.754521]  do_one_initcall+0x194/0x388
[    4.754562]  do_initcall_level+0x194/0x1c0
[    4.754603]  do_initcalls+0x5c/0xa0
[    4.754642]  do_basic_setup+0x74/0x88
[    4.754682]  kernel_init_freeable+0x180/0x1d0
[    4.754721]  kernel_init+0x20/0x128
[    4.754760]  ret_from_fork+0x10/0x30
[    4.754787] 
[    4.754816] Freed by task 1:
[    4.754855]  kasan_save_stack+0x38/0x68
[    4.754896]  kasan_set_track+0x28/0x3c
[    4.754937]  kasan_set_free_info+0x24/0x48
[    4.754978]  __kasan_slab_free+0x120/0x150
[    4.755018]  kasan_slab_free+0x14/0x24
[    4.755055]  kfree+0x1b0/0x49c
[    4.755093]  kmalloc_uaf+0x58/0x90
[    4.755132]  kmalloc_tests_init+0x58/0x80
[    4.755171]  do_one_initcall+0x194/0x388
[    4.755212]  do_initcall_level+0x194/0x1c0
[    4.755252]  do_initcalls+0x5c/0xa0
[    4.755291]  do_basic_setup+0x74/0x88
[    4.755331]  kernel_init_freeable+0x180/0x1d0
[    4.755370]  kernel_init+0x20/0x128
[    4.755409]  ret_from_fork+0x10/0x30
[    4.755435] 
[    4.755472] The buggy address belongs to the object at ffffff8007ba9f00
[    4.755472]  which belongs to the cache kmalloc-128 of size 128
[    4.755515] The buggy address is located 8 bytes inside of
[    4.755515]  128-byte region [ffffff8007ba9f00, ffffff8007ba9f80)
[    4.755548] The buggy address belongs to the page:
[    4.755591] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7ba9
[    4.755628] flags: 0x200(slab)
[    4.755677] raw: 0000000000000200 dead000000000100 dead000000000122 ffffff8000203c80
[    4.755725] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[    4.755758] page dumped because: kasan: bad access detected
[    4.755784] 
[    4.755813] Memory state around the buggy address:
[    4.755852]  ffffff8007ba9e00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[    4.755892]  ffffff8007ba9e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    4.755932] >ffffff8007ba9f00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[    4.755963]                       ^
[    4.756000]  ffffff8007ba9f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    4.756040]  ffffff8007baa000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    4.756071] ==================================================================

1. Reading the fault log:

1) "BUG: KASAN: use-after-free in kmalloc_uaf+0x60/0x90" means that freed memory was accessed at offset 0x60 (in the disassembly) of the function kmalloc_uaf.

2) "Write of size 1 at addr ffffff8007ba9f08" means that the fault is a 1-byte write to memory that has already been freed, and the faulting address is ffffff8007ba9f08.

2. Locating the faulty code with gdb:

1) Copy the vmlinux file (./out/kernel/OBJ/linux-5.10/vmlinux) to prebuilts/gcc/linux-x86/aarch64/gcc-linaro-7.5.0-2019.12-x86_64-aarch64-linux-gnu/bin.

2) cd into prebuilts/gcc/linux-x86/aarch64/gcc-linaro-7.5.0-2019.12-x86_64-aarch64-linux-gnu/bin and run ./aarch64-linux-gnu-gdb vmlinux

3) Enter list *kmalloc_uaf+0x60. The output shows that the fault is at line 251, *(ptr + 8) = 'x':

0xffffffd0125ebeac is in kmalloc_uaf(../../src_tmp/linux-5.10/arch/arm64/mm/kasan_test.c:251).
246        pr_err("Allocation failed\n");
247        return;
248    }
249
250    kfree(ptr);
251    *(ptr + 8) = 'x';
252 }
253
254 static noinline void __init kmalloc_uaf_memset(void)
255 {

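3. Analyze the code of the faulty function, determine the root cause, and fix it.

Line 250 frees ptr, but line 251 continues to use ptr, which causes the use-after-free fault.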
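
The shadow dump in the use-after-free log shows fa fb fb ... for the freed object, where the out-of-bounds log showed 00 ... 03 for a live one. The lookup below is a small Python sketch for reading such dumps. The marker values are believed to match the generic-mode constants in mm/kasan/kasan.h of the 5.10 kernel; treat the table as an assumption and verify it against your own tree. The function name describe_shadow is illustrative.

```python
# Assumed generic-mode shadow markers (verify against mm/kasan/kasan.h).
SHADOW_MARKERS = {
    0xFF: "freed page",
    0xFE: "redzone of a large (page-level) allocation",
    0xFC: "redzone inside a slab object",
    0xFB: "freed slab object",
    0xFA: "freed slab object with free track recorded",
}


def describe_shadow(value):
    """Translate one shadow byte from a KASAN memory-state dump into words."""
    if value == 0x00:
        return "all 8 bytes accessible"
    if 1 <= value <= 7:
        return f"first {value} bytes accessible"
    return SHADOW_MARKERS.get(value, "inaccessible (other marker)")


for byte in (0x00, 0x03, 0xFA, 0xFB, 0xFC):
    print(f"{byte:02x}: {describe_shadow(byte)}")
```

Read this way, the row for ffffff8007ba9f00 says that the whole 128-byte object has been freed, which is why the write at offset 8 is reported.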

6.3 Double free (double-free)

Fault log:

[    4.762071] BUG: KASAN: double-free or invalid-free in kmalloc_double_free_test+0x60/0x88
[    4.762099] 
[    4.762142] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.10.97 #1
[    4.762175] Hardware name: rockchip,rk3568-toybrick-dev-linux-x0 (DT)
[    4.762205] Call trace:
[    4.762244]  dump_backtrace+0x0/0x2b0
[    4.762282]  show_stack+0x24/0x30
[    4.762322]  dump_stack+0x100/0x170
[    4.762365]  print_address_description+0x7c/0x510
[    4.762410]  kasan_report_invalid_free+0x64/0x90
[    4.762452]  __kasan_slab_free+0xd8/0x150
[    4.762494]  kasan_slab_free+0x14/0x24
[    4.762531]  kfree+0x1b0/0x49c
[    4.762572]  kmalloc_double_free_test+0x60/0x88
[    4.762610]  kmalloc_tests_init+0x64/0x80
[    4.762650]  do_one_initcall+0x194/0x388
[    4.762691]  do_initcall_level+0x194/0x1c0
[    4.762732]  do_initcalls+0x5c/0xa0
[    4.762772]  do_basic_setup+0x74/0x88
[    4.762813]  kernel_init_freeable+0x180/0x1d0
[    4.762854]  kernel_init+0x20/0x128
[    4.762895]  ret_from_fork+0x10/0x30
[    4.762922] 
[    4.762952] Allocated by task 1:
[    4.762992]  kasan_save_stack+0x38/0x68
[    4.763032]  __kasan_kmalloc+0xd4/0xfc
[    4.763072]  kasan_kmalloc+0x10/0x1c
[    4.763111]  kmem_cache_alloc_trace+0x1fc/0x2e8
[    4.763150]  kmalloc_double_free_test+0x4c/0x88
[    4.763188]  kmalloc_tests_init+0x64/0x80
[    4.763227]  do_one_initcall+0x194/0x388
[    4.763268]  do_initcall_level+0x194/0x1c0
[    4.763308]  do_initcalls+0x5c/0xa0
[    4.763348]  do_basic_setup+0x74/0x88
[    4.763388]  kernel_init_freeable+0x180/0x1d0
[    4.763427]  kernel_init+0x20/0x128
[    4.763467]  ret_from_fork+0x10/0x30
[    4.763493] 
[    4.763522] Freed by task 1:
[    4.763562]  kasan_save_stack+0x38/0x68
[    4.763601]  kasan_set_track+0x28/0x3c
[    4.763643]  kasan_set_free_info+0x24/0x48
[    4.763684]  __kasan_slab_free+0x120/0x150
[    4.763725]  kasan_slab_free+0x14/0x24
[    4.763762]  kfree+0x1b0/0x49c
[    4.763802]  kmalloc_double_free_test+0x58/0x88
[    4.763840]  kmalloc_tests_init+0x64/0x80
[    4.763879]  do_one_initcall+0x194/0x388
[    4.763919]  do_initcall_level+0x194/0x1c0
[    4.763959]  do_initcalls+0x5c/0xa0
[    4.763998]  do_basic_setup+0x74/0x88
[    4.764039]  kernel_init_freeable+0x180/0x1d0
[    4.764079]  kernel_init+0x20/0x128
[    4.764119]  ret_from_fork+0x10/0x30
[    4.764146] 
[    4.764183] The buggy address belongs to the object at ffffff8007bab300
[    4.764183]  which belongs to the cache kmalloc-128 of size 128
[    4.764228] The buggy address is located 0 bytes inside of
[    4.764228]  128-byte region [ffffff8007bab300, ffffff8007bab380)
[    4.764260] The buggy address belongs to the page:
[    4.764303] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7bab
[    4.764339] flags: 0x200(slab)
[    4.764388] raw: 0000000000000200 dead000000000100 dead000000000122 ffffff8000203c80
[    4.764437] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[    4.764469] page dumped because: kasan: bad access detected
[    4.764496] 
[    4.764524] Memory state around the buggy address:
[    4.764562]  ffffff8007bab200: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[    4.764602]  ffffff8007bab280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    4.764642] >ffffff8007bab300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[    4.764673]                    ^
[    4.764711]  ffffff8007bab380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    4.764752]  ffffff8007bab400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    4.764784] ==================================================================

1. Reading the fault log:

1) "BUG: KASAN: double-free or invalid-free in kmalloc_double_free_test+0x60/0x88" means that memory was freed a second time at offset 0x60 (in the disassembly) of the function kmalloc_double_free_test.

2. Locating the faulty code with gdb:

1) Copy the vmlinux file (./out/kernel/OBJ/linux-5.10/vmlinux) to prebuilts/gcc/linux-x86/aarch64/gcc-linaro-7.5.0-2019.12-x86_64-aarch64-linux-gnu/bin.

2) cd into prebuilts/gcc/linux-x86/aarch64/gcc-linaro-7.5.0-2019.12-x86_64-aarch64-linux-gnu/bin and run ./aarch64-linux-gnu-gdb vmlinux

3) Enter list *kmalloc_double_free_test+0x60. The output shows that the fault is at line 308, kfree(ptr);:

0xffffffd0125ec0a4 is in kmalloc_double_free_test(../../src_tmp/linux-5.10/arch/arm64/mm/kasan_test.c:308).
304        return;
305    }
306
307    kfree(ptr);
308    kfree(ptr);
309 }
310
311 static int __init kmalloc_tests_init(void)
312 {

3. Analyze the code of the faulty function, determine the root cause, and fix it.

Lines 307 and 308 free ptr twice.
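
In real code the two frees are rarely on adjacent lines, so it helps to compare the "Freed by task" stack (where the object was first freed) with the "Call trace" stack (where the second free was attempted). The Python sketch below shows one way to do that. The helper names and the SKIP_PREFIXES list are heuristics invented for this example, and the sample is an abbreviated copy of the log above.

```python
import re

FRAME = re.compile(r"\]\s+([A-Za-z_][\w.]*)\+(0x[0-9a-f]+)/0x[0-9a-f]+\s*$")
HEADERS = ("Call trace:", "Allocated by task", "Freed by task")
# Frames that belong to the reporting and allocator machinery (heuristic list).
SKIP_PREFIXES = ("dump_", "show_stack", "print_address_description",
                 "kasan_", "__kasan_", "__asan_", "kfree", "kmem_cache_")


def split_stacks(report_lines):
    """Group the stack frames of one report under the header that introduces them."""
    stacks, current = {}, None
    for line in report_lines:
        header = next((h for h in HEADERS if h in line), None)
        if header is not None:
            current = stacks.setdefault(header, [])
            continue
        m = FRAME.search(line)
        if m is None:
            current = None                 # blank or unrelated line ends a stack
        elif current is not None:
            current.append((m.group(1), int(m.group(2), 16)))
    return stacks


def first_caller(frames):
    """Return the first frame that is not part of KASAN or the allocator."""
    for name, offset in frames:
        if not name.startswith(SKIP_PREFIXES):
            return name, offset
    return None


report = [
    "[    4.762205] Call trace:",
    "[    4.762244]  dump_backtrace+0x0/0x2b0",
    "[    4.762410]  kasan_report_invalid_free+0x64/0x90",
    "[    4.762531]  kfree+0x1b0/0x49c",
    "[    4.762572]  kmalloc_double_free_test+0x60/0x88",
    "[    4.762922] ",
    "[    4.763522] Freed by task 1:",
    "[    4.763562]  kasan_save_stack+0x38/0x68",
    "[    4.763762]  kfree+0x1b0/0x49c",
    "[    4.763802]  kmalloc_double_free_test+0x58/0x88",
]
stacks = split_stacks(report)
print(first_caller(stacks["Freed by task"]))   # where the object was first freed
print(first_caller(stacks["Call trace:"]))     # where the second free happened
```

Both results point into kmalloc_double_free_test, at offsets 0x58 and 0x60, which is consistent with the two consecutive kfree(ptr) calls in the source.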

 

 
