Why does the code below run without crashing?
The size I can get away with also depends entirely on the machine, platform, and compiler. I can even use an index as high as 200 on a 64-bit machine. How does the OS detect a segmentation fault in the main function?
int main(int argc, char* argv[])
{
int arr[3];
arr[4] = 99;
}
Where does this buffer space come from? Is it the stack allocated to the process?
Something I wrote some time ago for educational purposes...
Consider the following C program:
int q[200];
int main(void) {
int i;
for(i=0;i<2000;i++) {
q[i]=i;
}
}
After compiling and running it, a core dump is produced:
$ gcc -ggdb3 segfault.c
$ ulimit -c unlimited
$ ./a.out
Segmentation fault (core dumped)
Now use gdb to perform a post mortem analysis:
$ gdb -q ./a.out core
Program terminated with signal 11, Segmentation fault.
[New process 7221]
#0 0x080483b4 in main () at s.c:8
8 q[i]=i;
(gdb) p i
$1 = 1008
(gdb)
Huh, the program didn't segfault when it wrote outside the 200 items allocated; instead it crashed when i=1008. Why?
Enter pages.
The page size can be determined in several ways on UNIX/Linux. One way is to use the system function sysconf() like this:
#include <stdio.h>
#include <unistd.h> // sysconf(3)
int main(void) {
printf("The page size for this system is %ld bytes.\n",
sysconf(_SC_PAGESIZE));
return 0;
}
which gives the output:
The page size for this system is 4096 bytes.
Alternatively, one can use the command-line utility getconf like this:
$ getconf PAGESIZE
4096
Post mortem
It turns out that the segfault occurs not at i=200 but at i=1008. Let's figure out why. Start gdb to do some post mortem analysis:
$ gdb -q ./a.out core
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
[New process 4605]
#0 0x080483b4 in main () at seg.c:6
6 q[i]=i;
(gdb) p i
$1 = 1008
(gdb) p &q
$2 = (int (*)[200]) 0x804a040
(gdb) p &q[199]
$3 = (int *) 0x804a35c
q[199], the last element of q, starts at address 0x804a35c, so its last byte is at 0x804a35f. The page size is, as we saw earlier, 4096 bytes, so on this 32-bit machine a virtual address breaks down into a 20-bit page number and a 12-bit offset.
q[] ended in virtual page number:
0x804a = 32842
with q[199] at offset:
0x35c = 860
q[199] occupies 4 bytes starting at that offset, so q ends at offset 864 and there were still:
4096 - 864 = 3232
bytes left on the page of memory on which q[] was allocated. That space can hold:
3232 / 4 = 808
integers, and the code treated it as if it contained elements of q at positions 200 to 1007.
We all know that those elements don't exist, but the compiler didn't complain, and neither did the hardware, since we have write permission to that page. Only when i=1008 did q[i] refer to an address on a different page, one for which we didn't have write permission; the virtual memory hardware detected this and triggered a segfault.
An integer is stored in 4 bytes, meaning that this page contains 808 (3232/4) additional fake elements. The hardware therefore lets you access q[200], q[201], and so on up to element 199+808=1007 (q[1007]) without triggering a segfault, even though doing so is undefined behavior in C. When accessing q[1008] you enter a new page for which the permissions are different.