Example: tracking down a kernel out-of-bounds access with KASAN (Linux 5.18.11)

References:

1. Documentation/dev-tools/kasan.rst in the kernel source tree

2. KASAN - Kernel Address Sanitizer | Naveen Naidu (naveenaidu.dev)

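I. How KASAN works

KASAN (Kernel Address SANitizer) is a dynamic detector for invalid memory accesses. It catches two classes of bugs: use-after-free and out-of-bounds.

KASAN divides memory into groups of 8 bytes and uses one extra byte per group (shadow memory) to record how many bytes of that group may be accessed.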
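To make the 8-to-1 mapping concrete, here is a minimal userspace sketch of how an address is translated to the address of its shadow byte. It follows the formula given in Documentation/dev-tools/kasan.rst (shift right by 3, then add a fixed offset). The function name mem_to_shadow and the shadow_offset parameter are illustrative only; the real offset depends on the architecture and kernel configuration.

```c
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3 /* 2^3 = 8 bytes of memory per shadow byte */

/*
 * Illustrative only: compute the address of the shadow byte that describes
 * `addr`. `shadow_offset` stands in for the kernel's fixed shadow offset.
 */
uint64_t mem_to_shadow(uint64_t addr, uint64_t shadow_offset)
{
	return (addr >> SHADOW_SCALE_SHIFT) + shadow_offset;
}
```

With this mapping, every address in one 8-byte group shares a single shadow byte and the next group uses the next shadow byte: with an offset of 0, addresses 0x40 and 0x47 both map to 8, while 0x48 maps to 9.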

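Shadow memory values:
1) 0: all 8 bytes are accessible.
2) N (1 <= N <= 7): only the first N of the 8 bytes are accessible.
3) Negative (the shadow byte is read as a signed value): none of the 8 bytes is accessible; the exact value encodes the reason, see mm/kasan/kasan.h.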
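The three rules above can be written down as a small check. This is a simplified userspace sketch, not the kernel's implementation; the name shadow_allows and its parameters are made up for illustration, and it only handles accesses that stay inside a single 8-byte group.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 8 /* bytes of memory described by one shadow byte */

/*
 * Hypothetical helper: is an access of `size` bytes, starting at byte offset
 * `off` inside one 8-byte group, allowed by the shadow value `shadow`?
 * The shadow byte is signed, so 0xFF, 0xFC, ... are negative values.
 */
bool shadow_allows(int8_t shadow, size_t off, size_t size)
{
	if (size == 0 || off + size > GRANULE_SIZE)
		return false;                    /* not inside a single group */
	if (shadow == 0)
		return true;                     /* rule 1: all 8 bytes accessible */
	if (shadow < 0)
		return false;                    /* rule 3: nothing accessible */
	return off + size <= (size_t)shadow; /* rule 2: only the first N bytes */
}
```

For example, with a shadow value of 4 a 4-byte access at offset 0 is allowed, but a 4-byte access at offset 4 is not.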

mm/kasan/kasan.h

#define KASAN_FREE_PAGE         0xFF  /* page was freed */
#define KASAN_PAGE_REDZONE      0xFE  /* redzone for kmalloc_large allocations */
#define KASAN_KMALLOC_REDZONE   0xFC  /* redzone inside slub object */
#define KASAN_KMALLOC_FREE      0xFB  /* object was freed (kmem_cache_free/kfree) */
#define KASAN_VMALLOC_INVALID   0xF8  /* unallocated space in vmapped page */

II. Out-of-bounds example

Code snippet (a deliberately buggy test added to do_sys_openat2() in fs/open.c):

1197 static long do_sys_openat2(int dfd, const char __user *filename,
1198                            struct open_how *how)
1199 {
1200         struct open_flags op;
1201         int fd = build_open_flags(how, &op);
1202         struct filename *tmp;
1203         int *kasan;
1204         int i;
1205 
1206         if (fd)
1207                 return fd;
1208 
1209         tmp = getname(filename);
1210         if (IS_ERR(tmp))
1211                 return PTR_ERR(tmp);
1212 
1213         if (!strcmp(tmp->name, "a")) {
1214                 kasan = kmalloc(100, GFP_KERNEL);
1215                 if (kasan) {
1216                         for (i=0; i < 200; i++)
1217                                 *kasan++ = 'a';
1218                         printk("%s %d: kasan test finish\n", __func__, __LINE__);
1219                 }
1220         }
1221 
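Because kasan is an int *, each loop iteration stores 4 bytes (assuming a 4-byte int, as on x86-64), so the 200 iterations would write 800 bytes into a 100-byte allocation. The following userspace sketch, with hypothetical helper names, works out which iteration is the first to leave the allocation.

```c
#include <stddef.h>

/*
 * Hypothetical helper: a loop stores `elem_size`-byte elements back to back
 * from the start of an `alloc_size`-byte allocation. Iteration i covers bytes
 * [i * elem_size, (i + 1) * elem_size), so the first iteration that is not
 * fully inside the allocation is alloc_size / elem_size (integer division).
 */
size_t first_oob_iteration(size_t alloc_size, size_t elem_size)
{
	return alloc_size / elem_size;
}

/* Byte offset, from the start of the allocation, written by iteration i. */
size_t store_offset(size_t i, size_t elem_size)
{
	return i * elem_size;
}
```

For kmalloc(100) and 4-byte stores, the first bad iteration is i = 25, which writes at offset 100 (0x64). That matches the report below: a write of size 4 at an address ending in d64, inside an object that starts at an address ending in d00.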

Error log triggered by the code:

The cat process (pid 158) went out of bounds while writing 4 bytes at address ffff888000949d64.
[  103.282480] BUG: KASAN: slab-out-of-bounds in do_sys_openat2+0x453/0x4d0
[  103.283613] Write of size 4 at addr ffff888000949d64 by task cat/158
[  103.284310] 
[  103.284928] CPU: 0 PID: 158 Comm: cat Not tainted 5.18.11-gac599649f534-dirty #125
[  103.285796] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014

Call stack:
[  103.287065] Call Trace:
[  103.287479]  <TASK>
[  103.287827]  dump_stack_lvl+0x34/0x44
[  103.288366]  print_report.cold+0xb2/0x6b7
[  103.288995]  ? do_sys_openat2+0x453/0x4d0
[  103.289480]  kasan_report+0xa9/0x120
[  103.289889]  ? do_sys_openat2+0x453/0x4d0
[  103.290218]  do_sys_openat2+0x453/0x4d0
[  103.290554]  ? file_open_root+0x210/0x210
[  103.290978]  do_sys_open+0x85/0xe0
[  103.291329]  ? filp_open+0x50/0x50
[  103.291658]  ? fpregs_assert_state_consistent+0x50/0x60
[  103.292056]  ? __x64_sys_open+0x2a/0x50
[  103.292369]  do_syscall_64+0x3b/0x90
[  103.292691]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  103.293300] RIP: 0033:0x7fbd3899d0bc
[  103.294022] Code: 10 00 00 00 8b 54 24 50 48 89 44 24 30 48 8d 44 24 40 48 89 44 24 38 83 3d 80 a9 07 00 00 48 63 f6 75 21 b8 02 00 00 00 3
[  103.295888] RSP: 002b:00007fff7173b970 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[  103.297227] RAX: ffffffffffffffda RBX: 00007fff7173bc70 RCX: 00007fbd3899d0bc
[  103.298154] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fff7173cf54
[  103.298949] RBP: 00007fff7173cf54 R08: 00007fff7173cfe0 R09: 0000000000000000
[  103.299716] R10: 00007fbd38a192b0 R11: 0000000000000246 R12: 0000000000000000
[  103.300461] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fff7173bc70
[  103.301269]  </TASK>
[  103.301672] 

The overrun memory was allocated by the process with pid 158:
[  103.301975] Allocated by task 158:
[  103.302520]  kasan_save_stack+0x1e/0x40

do_sys_open --> do_sys_openat2 --> kmalloc allocated this block:
[  103.303045]  __kasan_kmalloc+0x81/0xa0
[  103.303475]  do_sys_openat2+0x434/0x4d0
[  103.303895]  do_sys_open+0x85/0xe0
[  103.304215]  do_syscall_64+0x3b/0x90
[  103.304596]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  103.305212] 
[  103.305516] Last potentially related work creation:
[  103.306069]  kasan_save_stack+0x1e/0x40
[  103.306580]  __kasan_record_aux_stack+0x97/0xa0
[  103.307259]  call_rcu+0x41/0x4c0
[  103.307773]  __inet_insert_ifa+0x3e0/0x4b0
[  103.308288]  devinet_ioctl+0x767/0xb20
[  103.308742]  inet_ioctl+0x24e/0x280
[  103.309210]  sock_do_ioctl+0xb4/0x190
[  103.309719]  sock_ioctl+0x2b1/0x3e0
[  103.310210]  __x64_sys_ioctl+0xb4/0xf0
[  103.310670]  do_syscall_64+0x3b/0x90
[  103.311088]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  103.311696] 

The overrun block spans [ffff888000949d00, ffff888000949d80), i.e. 128 bytes (0x49d80 - 0x49d00 = 0x80 = 128). kmalloc asked for 100 bytes, so the object was served from the kmalloc-128 cache.

[  103.311961] The buggy address belongs to the object at ffff888000949d00
[  103.311961]  which belongs to the cache kmalloc-128 of size 128

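ffff888000949d64 lies at offset 100 in this region (0x49d64 - 0x49d00 = 0x64 = 100). Counting from 1, that is the 101st byte, i.e. the first byte past the 100 bytes that were requested.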
[  103.313130] The buggy address is located 100 bytes inside of
[  103.313130]  128-byte region [ffff888000949d00, ffff888000949d80)
[  103.314341] 

[  103.314742] The buggy address belongs to the physical page:
[  103.315549] page:00000000975ebcd3 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x949
[  103.317102] flags: 0x200(slab|node=0|zone=0)
[  103.318836] raw: 0000000000000200 ffffea0000022240 dead000000000004 ffff8880048418c0
[  103.319675] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[  103.320536] page dumped because: kasan: bad access detected
[  103.321099] 
[  103.321319] Memory state around the buggy address:
[  103.322273]  ffff888000949c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  103.323105]  ffff888000949c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

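KASAN divides memory into groups of 8 bytes and uses 1 extra byte (the shadow memory) to record how many bytes of each group are accessible. Each byte shown in the dump below is one shadow memory value, so each row of 16 shadow bytes covers 128 bytes of memory. The 1st position is 00, meaning all 8 bytes of the 1st group are accessible; the 13th position is 04, meaning only the first 4 bytes of the 13th group are accessible.

Offset 100 falls in the 13th group at offset 4 within that group (100 / 8 = 12 remainder 4), i.e. it is the group's 5th byte. The 04 that ^ points to in the log means only the first 4 bytes of that group (object bytes 96 to 99) are accessible.

The for loop at line 1216 of our code goes on to write the group's 5th byte, so KASAN reports the access.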
[  103.323937] >ffff888000949d00: 00 00 00 00 00 00 00 00 00 00 00 00 04 fc fc fc
[  103.324687]                                                        ^
[  103.325414]  ffff888000949d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  103.326049]  ffff888000949e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
[  103.326727]
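The reading of the dump above can be reproduced with a short userspace sketch. It assumes the shadow row copied from the log and the object base ffff888000949d00; the names shadow_pos, locate and byte_accessible are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 8u /* bytes of memory per shadow byte */

struct shadow_pos {
	size_t group;    /* 0-based index of the 8-byte group */
	size_t in_group; /* 0-based byte offset inside that group */
};

/* Split the distance between `addr` and the region start into group/offset. */
struct shadow_pos locate(uint64_t base, uint64_t addr)
{
	struct shadow_pos pos;
	uint64_t off;

	assert(addr >= base);
	off = addr - base;
	pos.group = (size_t)(off / GRANULE_SIZE);
	pos.in_group = (size_t)(off % GRANULE_SIZE);
	return pos;
}

/* Shadow row printed for ffff888000949d00 in the log above. */
const uint8_t shadow_row[16] = {
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x04, 0xfc, 0xfc, 0xfc,
};

/* Returns 1 if the single byte at `addr` is accessible, 0 otherwise. */
int byte_accessible(const uint8_t *row, size_t nrow,
		    uint64_t base, uint64_t addr)
{
	struct shadow_pos pos = locate(base, addr);
	int8_t s;

	if (pos.group >= nrow)
		return 0;               /* outside the dumped row */
	s = (int8_t)row[pos.group];     /* shadow bytes are signed */
	if (s == 0)
		return 1;               /* whole group accessible */
	return s > 0 && pos.in_group < (size_t)s;
}
```

For the buggy address, locate() yields group 12 (the 13th group) and offset 4; the shadow value there is 04, and since offset 4 is not below 4 the byte is not accessible. The byte just before it (offset 99 in the object) still is.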

==================================================================
[  103.328184] Disabling lock debugging due to kernel taint
[  103.329028] do_sys_openat2 1218: kasan test finish

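III. Resolving the source line of the fault

If the kernel was built with debug info (CONFIG_DEBUG_INFO, which in turn requires CONFIG_DEBUG_KERNEL=y, i.e. Kernel hacking ---> Kernel debugging), the decode_stacktrace.sh script shipped with the kernel can resolve the line numbers.

Command format (the kernel source path is optional and is omitted in the example below):
decode_stacktrace.sh  VMLINUX_PATH  KERNEL_SOURCE_PATH  <  CRASH_LOG_PATH  >  output.log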

root@linux:/home/gsf/debug/kernel/linux-5.18.11# ./scripts/decode_stacktrace.sh /home/gsf/debug/kernel/linux-5.18.11/vmlinux < /home/gsf/kernel.crash

……
[  103.281530] 
[  103.282480] BUG: KASAN: slab-out-of-bounds in do_sys_openat2 (fs/open.c:1217 (discriminator 3))
[  103.283613] Write of size 4 at addr ffff888000949d64 by task cat/158
[  103.284310]
[  103.284928] CPU: 0 PID: 158 Comm: cat Not tainted 5.18.11-gac599649f534-dirty #125
[  103.285796] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  103.287065] Call Trace:
[  103.287479]  <TASK>
[  103.287827] dump_stack_lvl (lib/dump_stack.c:107)
[  103.288366] print_report.cold (mm/kasan/report.c:314 mm/kasan/report.c:429)
[  103.288995] ? do_sys_openat2 (fs/open.c:1217 (discriminator 3))
[  103.289480] kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:493)
[  103.289889] ? do_sys_openat2 (fs/open.c:1217 (discriminator 3))
[  103.290218] do_sys_openat2 (fs/open.c:1217 (discriminator 3))
[  103.290554] ? file_open_root (fs/open.c:1199)
[  103.290978] do_sys_open (fs/open.c:1238)
[  103.291329] ? filp_open (fs/open.c:1238)
[  103.291658] ? fpregs_assert_state_consistent (arch/x86/kernel/fpu/context.h:39 arch/x86/kernel/fpu/core.c:772)
[  103.292056] ? __x64_sys_open (fs/open.c:1248 fs/open.c:1244 fs/open.c:1244)
[  103.292369] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)

……
