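A problem turned up in our project's code: a function used a variable-length array, and an assignment past the end of that array caused the array's base address to become NULL.
I extracted the problem into a small test function, which shows similar behavior on i386. The code is as follows: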
#include <stdio.h>

int test(int n)
{
    char *arg[n + 4];

    printf("before:arg = %p\n", &arg[0]);
    arg[16] = NULL;    /* out of bounds: with n = 2, arg has only 6 elements */
    printf("after:arg = %p\n", &arg[0]);
    return 0;
}

int main()
{
    test(2);
    return 0;
}
On i386, once the statement "arg[16] = NULL" has executed, printing the array's first address again shows NULL.
Tracing the problem with gdb:
(gdb) b test
Breakpoint 1 at 0x804835b: file test.c, line 4.
(gdb) display /i $pc
(gdb) run
Starting program: /home/lichuang/test/a.out
Breakpoint 1, test (n=2) at test.c:4
4 {
1: x/i $pc 0x804835b <test+7>: mov %esp,%eax
(gdb) si
0x0804835d 4 {
1: x/i $pc 0x804835d <test+9>: mov %eax,%ebx
(gdb)
5 char *arg[n + 4];
1: x/i $pc 0x804835f <test+11>: mov 0x8(%ebp),%eax
(gdb)
0x08048362 5 char *arg[n + 4];
1: x/i $pc 0x8048362 <test+14>: add $0x4,%eax
(gdb)
0x08048365 5 char *arg[n + 4];
1: x/i $pc 0x8048365 <test+17>: shl $0x2,%eax
(gdb)
0x08048368 5 char *arg[n + 4];
1: x/i $pc 0x8048368 <test+20>: add $0xf,%eax
(gdb)
0x0804836b 5 char *arg[n + 4];
1: x/i $pc 0x804836b <test+23>: add $0xf,%eax
(gdb)
0x0804836e 5 char *arg[n + 4];
1: x/i $pc 0x804836e <test+26>: shr $0x4,%eax
(gdb)
0x08048371 5 char *arg[n + 4];
1: x/i $pc 0x8048371 <test+29>: shl $0x4,%eax
(gdb)
0x08048374 5 char *arg[n + 4];
1: x/i $pc 0x8048374 <test+32>: sub %eax,%esp
(gdb)
0x08048376 5 char *arg[n + 4];
1: x/i $pc 0x8048376 <test+34>: lea 0x8(%esp),%eax
(gdb)
0x0804837a 5 char *arg[n + 4];
1: x/i $pc 0x804837a <test+38>: mov %eax,0xffffffe8(%ebp)
(gdb)
0x0804837d 5 char *arg[n + 4];
1: x/i $pc 0x804837d <test+41>: mov 0xffffffe8(%ebp),%eax
(gdb)
0x08048380 5 char *arg[n + 4];
1: x/i $pc 0x8048380 <test+44>: add $0xf,%eax
(gdb)
0x08048383 5 char *arg[n + 4];
1: x/i $pc 0x8048383 <test+47>: shr $0x4,%eax
(gdb)
0x08048386 5 char *arg[n + 4];
1: x/i $pc 0x8048386 <test+50>: shl $0x4,%eax
(gdb)
0x08048389 5 char *arg[n + 4];
1: x/i $pc 0x8048389 <test+53>: mov %eax,0xffffffe8(%ebp)
(gdb)
0x0804838c 5 char *arg[n + 4];
1: x/i $pc 0x804838c <test+56>: mov 0xffffffe8(%ebp),%eax
(gdb)
0x0804838f 5 char *arg[n + 4];
1: x/i $pc 0x804838f <test+59>: mov %eax,0xfffffff8(%ebp)
(gdb)
7 printf("before:arg = %p\n", &arg[0]);
1: x/i $pc 0x8048392 <test+62>: mov 0xfffffff8(%ebp),%eax
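The above is the output of single-stepping through the assembly with gdb. As you can see, defining the variable-length array arg[n + 4] executes quite a few instructions, and the secret lies in that assembly. Disassembling the program with objdump -d and pulling out the corresponding part gives: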
804835b: 89 e0 mov %esp,%eax
804835d: 89 c3 mov %eax,%ebx
804835f: 8b 45 08 mov 0x8(%ebp),%eax
8048362: 83 c0 04 add $0x4,%eax
8048365: c1 e0 02 shl $0x2,%eax
8048368: 83 c0 0f add $0xf,%eax
804836b: 83 c0 0f add $0xf,%eax
804836e: c1 e8 04 shr $0x4,%eax
8048371: c1 e0 04 shl $0x4,%eax
8048374: 29 c4 sub %eax,%esp
8048376: 8d 44 24 08 lea 0x8(%esp),%eax
804837a: 89 45 e8 mov %eax,0xffffffe8(%ebp)
804837d: 8b 45 e8 mov 0xffffffe8(%ebp),%eax
8048380: 83 c0 0f add $0xf,%eax
8048383: c1 e8 04 shr $0x4,%eax
8048386: c1 e0 04 shl $0x4,%eax
8048389: 89 45 e8 mov %eax,0xffffffe8(%ebp)
804838c: 8b 45 e8 mov 0xffffffe8(%ebp),%eax
804838f: 89 45 f8 mov %eax,0xfffffff8(%ebp)
8048392: 8b 45 f8 mov 0xfffffff8(%ebp),%eax
Going through it instruction by instruction:
804835b: 89 e0 mov %esp,%eax
804835d: 89 c3 mov %eax,%ebx
Saves the current value of the esp register into ebx, by way of eax.
804835f: 8b 45 08 mov 0x8(%ebp),%eax
8048362: 83 c0 04 add $0x4,%eax
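First it loads the value of the argument n passed to test (located at memory address ebp+8), then adds 4 to it, which gives the number of elements in the array arg[n+4].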
8048365: c1 e0 02 shl $0x2,%eax
8048368: 83 c0 0f add $0xf,%eax
804836b: 83 c0 0f add $0xf,%eax
804836e: c1 e8 04 shr $0x4,%eax
8048371: c1 e0 04 shl $0x4,%eax
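It first shifts the element count left by two bits (shl $0x2), i.e. multiplies it by 4, which is sizeof(char*) on i386, giving the number of bytes needed to hold the elements of char* arg[n+4]. It then adds 0xf twice and shifts right by 4 bits and left by 4 bits. The shift pair clears the low four bits, which rounds the value down to a multiple of 16. The compiler wants the size aligned to 16 but must also leave enough room, hence the additions beforehand: one 0xf turns the round-down into a round-up, and the other adds up to 15 bytes of slack, presumably so that the array's start address can itself be aligned to 16 afterwards. Once these operations finish, eax holds a size that is a multiple of 16 and large enough to hold char* arg[n+4].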
8048374: 29 c4 sub %eax,%esp
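Adjusts esp by the eax value just computed, i.e. reserves enough space for the arg array at the low-address end of test's stack frame. Note that the original esp value was saved in ebx at the very beginning, so at the end of test, esp still has to be restored from ebx.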
8048376: 8d 44 24 08 lea 0x8(%esp),%eax
804837a: 89 45 e8 mov %eax,0xffffffe8(%ebp)
804837d: 8b 45 e8 mov 0xffffffe8(%ebp),%eax
8048380: 83 c0 0f add $0xf,%eax
8048383: c1 e8 04 shr $0x4,%eax
8048386: c1 e0 04 shl $0x4,%eax
8048389: 89 45 e8 mov %eax,0xffffffe8(%ebp)
804838c: 8b 45 e8 mov 0xffffffe8(%ebp),%eax
804838f: 89 45 f8 mov %eax,0xfffffff8(%ebp)