mit_os_lab1(Continuing)

MIT_OS_lab1

Notes

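Over the past two weeks I interviewed for security positions at Huawei and Tencent, and, well, it went terribly. I also failed the written tests for Trend Micro and 360, and today Chaitin emailed to say I might not even qualify as an intern. In the end I'm just not good enough: every interview asked about operating systems and kernels, and I couldn't answer any of it. My knowledge in this area is badly lacking, and honestly in every other area too, so I've made up my mind to buckle down and study. I came across this MIT course over the summer but didn't stick with it, since it's in English and I was still too much of a beginner. I'm not going to bother with autumn recruiting now; I'll spend the time studying properly instead. Let's go!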

Prelude

After I downloaded the JOS system from MIT and ran make, make qemu kept failing with the following error:

root@kali:~/os/jos# make qemu
qemu-system-i386 -drive file=obj/kern/kernel.img,index=0,media=disk,format=raw -serial mon:stdio -gdb tcp::25000 -D qemu.log
EAX=00000000 EBX=00000000 ECX=000001a9 EDX=00000000
ESI=00000000 EDI=f0113000 EBP=f010ffc8 ESP=f010ffbc
EIP=f01015e8 EFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= 00007c4c 00000017
IDT= 00000000 000003ff
CR0=80010011 CR2=00000040 CR3=00112000 CR4=00000000
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
EFER=0000000000000000
Triple fault. Halting for inspection via QEMU monitor.
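
As a side note, a few fields in this register dump can be decoded by hand. The sketch below is only an illustration: the bit names follow the standard x86 CR0 layout, the 8-byte entry size is that of a 32-bit protected-mode IDT gate, and the helper names decode_cr0 and idt_vector are made up for this post. Since CR2 holds the address of the most recent page fault, CR2=0x40 with an IDT base of 0 suggests the CPU faulted while fetching IDT entry 8 (the double-fault gate), which would fit a triple fault. That is my reading of the dump, not something QEMU states.

```python
# Standard x86 CR0 flag positions (bit index -> name).
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}

def decode_cr0(value):
    """Return the names of the CR0 flags that are set, lowest bit first."""
    return [name for bit, name in sorted(CR0_BITS.items()) if (value >> bit) & 1]

def idt_vector(idt_base, idt_limit, addr):
    """Map an address inside the IDT to its vector number (8 bytes per gate).

    Returns None when the address lies outside the table.
    """
    offset = addr - idt_base
    if offset < 0 or offset > idt_limit:
        return None
    return offset // 8

# Values copied from the QEMU dump: CR0=80010011, CR2=00000040,
# IDT base 00000000 with limit 000003ff.
print(decode_cr0(0x80010011))         # ['PE', 'ET', 'WP', 'PG']
print(idt_vector(0x0, 0x3FF, 0x40))   # 8
```

So paging (PG) and write protection (WP) are both enabled when the fault happens, and the faulting address falls on vector 8.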

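The emulator window looked like this:
[Image: JOS_error_screen]
In another terminal, in the same directory, I used gdb to print the error information: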

root@kali:~/os/jos# gdb -q
+ target remote localhost:25000
warning: No executable has been specified and target does not support
determining executable automatically. Try using the "file" command.
The target architecture is assumed to be i386
=> 0xf01015e8: Error while running hook_stop:
Cannot access memory at address 0xf01015e8
0xf01015e8 in ?? ()
+ symbol-file obj/kern/kernel
gdb-peda$ x/i 0xf01015e8
=> 0xf01015e8 <memset+73>: Cannot access memory at address 0xf01015e8

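The system stopped at eip = 0xf01015e8, which x/i places at memset+73, so we've found the function where things go wrong. Next, set a breakpoint at the entry to memset to see which instruction fails. The system above has already halted, so its memory can no longer be read; rerun make qemu-gdb and then gdb. This time QEMU waits at the very first instruction (the BIOS reset vector at f000:fff0), before any boot code has run: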

root@kali:~/os/jos# gdb -q
+ target remote localhost:25000
warning: No executable has been specified and target does not support
determining executable automatically. Try using the "file" command.
warning: A handler for the OS ABI "GNU/Linux" is not built into this configuration
of GDB. Attempting to continue with the default i8086 settings.
The target architecture is assumed to be i8086
[f000:fff0] 0xffff0: jmp 0xf000:0xe05b
0x0000fff0 in ?? ()
+ symbol-file obj/kern/kernel
gdb-peda$ b memset
Breakpoint 1 at 0xf010159f: file lib/string.c, line 123.
gdb-peda$ c
Continuing.
The target architecture is assumed to be i386
=> 0xf010159f <memset>: push ebp
Breakpoint 1, memset (v=0xf0111300, c=0x0, n=0x23a4) at lib/string.c:123
123 {
gdb-peda$ disassemble memset
Dump of assembler code for function memset:
=> 0xf010159f <+0>: push ebp
0xf01015a0 <+1>: mov ebp,esp
0xf01015a2 <+3>: push edi
0xf01015a3 <+4>: push esi
0xf01015a4 <+5>: push ebx
0xf01015a5 <+6>: mov edi,DWORD PTR [ebp+0x8]
0xf01015a8 <+9>: mov ecx,DWORD PTR [ebp+0x10]
0xf01015ab <+12>: mov eax,edi
0xf01015ad <+14>: test ecx,ecx
0xf01015af <+16>: je 0xf01015c6 <memset+39>
0xf01015b1 <+18>: test edi,0x3
0xf01015b7 <+24>: jne 0xf01015be <memset+31>
0xf01015b9 <+26>: test cl,0x3
0xf01015bc <+29>: je 0xf01015cb <memset+44>
0xf01015be <+31>: mov eax,DWORD PTR [ebp+0xc]
0xf01015c1 <+34>: cld
0xf01015c2 <+35>: rep stos BYTE PTR es:[edi],al
0xf01015c4 <+37>: mov eax,edi
0xf01015c6 <+39>: pop ebx
0xf01015c7 <+40>: pop esi
0xf01015c8 <+41>: pop edi
0xf01015c9 <+42>: pop ebp
0xf01015ca <+43>: ret
0xf01015cb <+44>: movzx edx,BYTE PTR [ebp+0xc]
0xf01015cf <+48>: mov eax,edx
0xf01015d1 <+50>: shl eax,0x8
0xf01015d4 <+53>: mov ebx,edx
0xf01015d6 <+55>: shl ebx,0x18
0xf01015d9 <+58>: mov esi,edx
0xf01015db <+60>: shl esi,0x10
0xf01015de <+63>: or ebx,esi
0xf01015e0 <+65>: or edx,ebx
0xf01015e2 <+67>: shr ecx,0x2
0xf01015e5 <+70>: or eax,edx
0xf01015e7 <+72>: cld
0xf01015e8 <+73>: rep stos DWORD PTR es:[edi],eax
0xf01015ea <+75>: mov eax,edi
0xf01015ec <+77>: jmp 0xf01015c6 <memset+39>
End of assembler dump.

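The eip at the time of the fault is 0xf01015e8 <+73>: rep stos DWORD PTR es:[edi],eax. This instruction stores ecx copies of eax starting at edi; that is, every DWORD in the range from edi to edi + 4*ecx is overwritten with eax. Let's set a breakpoint here and see whether it runs at all at first: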

gdb-peda$ b *0xf01015e8
Breakpoint 1 at 0xf01015e8: file lib/string.c, line 131.
gdb-peda$ c
Continuing.
The target architecture is assumed to be i386
=> 0xf01015e8 <memset+73>: rep stos DWORD PTR es:[edi],eax
Breakpoint 1, 0xf01015e8 in memset (v=0xf0111300, c=0x0, n=0x23a4) at lib/string.c:131
131 asm volatile("cld; rep stosl\n"
gdb-peda$ define xx
Type commands for definition of "xx".
End with a line saying just "end".
>si
>x/5i $eip
>i r
>end
gdb-peda$ xx
=> 0xf01015e8 <memset+73>: rep stos DWORD PTR es:[edi],eax
Breakpoint 1, 0xf01015e8 in memset (v=0xf0111300, c=0x0, n=0x23a4) at lib/string.c:131
131 asm volatile("cld; rep stosl\n"
=> 0xf01015e8 <memset+73>: rep stos DWORD PTR es:[edi],eax
0xf01015ea <memset+75>: mov eax,edi
0xf01015ec <memset+77>: jmp 0xf01015c6 <memset+39>
0xf01015ee <memmove>: push ebp
0xf01015ef <memmove+1>: mov ebp,esp
eax 0x0 0x0
ecx 0x8e8 0x8e8
edx 0x0 0x0
ebx 0x0 0x0
esp 0xf010ffbc 0xf010ffbc
ebp 0xf010ffc8 0xf010ffc8
esi 0x0 0x0
edi 0xf0111304 0xf0111304
eip 0xf01015e8 0xf01015e8 <memset+73>
eflags 0x46 [ PF ZF ]
cs 0x8 0x8
ss 0x10 0x10
ds 0x10 0x10
es 0x10 0x10
fs 0x10 0x10
gs 0x10 0x10
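
To make the register values above easier to follow, here is a small Python model of rep stos with a DWORD operand. It is a sketch, not an emulator: rep_stosd and its steps parameter are hypothetical names, the direction flag is assumed clear (memset executes cld first), and faults are not modelled. The starting values come from the gdb output: v=0xf0111300 and n=0x23a4.

```python
def rep_stosd(edi, ecx, steps=None):
    """Model `rep stos DWORD PTR es:[edi], eax` with the direction flag clear.

    Each iteration stores 4 bytes at [edi], adds 4 to edi and decrements ecx.
    `steps` caps the number of iterations (like single-stepping in gdb);
    None runs until ecx reaches 0. Returns (edi, ecx, addresses_written).
    """
    written = []
    while ecx != 0 and (steps is None or len(written) < steps):
        written.append(edi)            # one 4-byte store of eax
        edi = (edi + 4) & 0xFFFFFFFF   # 32-bit wraparound
        ecx -= 1
    return edi, ecx, written

v, n = 0xF0111300, 0x23A4              # memset arguments shown by gdb

# One `si` in gdb corresponds to one iteration of the rep prefix.
edi, ecx, _ = rep_stosd(v, n // 4, steps=1)
print(hex(edi), hex(ecx))              # 0xf0111304 0x8e8

# Running to completion would stop at v + n.
edi_end, ecx_end, _ = rep_stosd(v, n // 4)
print(hex(edi_end), ecx_end)           # 0xf01136a4 0
```

The first line matches the edi and ecx values in the register dump after a single step.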

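The instruction does execute: after one step, edi has advanced by 4 to 0xf0111304 and ecx has dropped to 0x8e8. There's no point single-stepping further, so continue straight to where the error occurs: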

gdb-peda$ disable 1
gdb-peda$ c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
=> 0xf01015e8 <memset+73>: Error while running hook_stop:
Cannot access memory at address 0xf01015e8
0xf01015e8 in memset (v=<error reading variable: Cannot access memory at address 0xf010ffc0>,
c=<error reading variable: Cannot access memory at address 0xf010ffc4>, n=<error reading variable: Cannot access memory at address 0xf010ffc8>) at lib/string.c:131
131 asm volatile("cld; rep stosl\n"
gdb-peda$ i r
eax 0x0 0x0
ecx 0x1a9 0x1a9
edx 0x0 0x0
ebx 0x0 0x0
esp 0xf010ffbc 0xf010ffbc
ebp 0xf010ffc8 0xf010ffc8
esi 0x0 0x0
edi 0xf0113000 0xf0113000
eip 0xf01015e8 0xf01015e8 <memset+73>
eflags 0x46 [ PF ZF ]
cs 0x8 0x8
ss 0x10 0x10
ds 0x10 0x10
es 0x10 0x10
fs 0x10 0x10
gs 0x10 0x10
gdb-peda$ x/10wx 0xf0113000
0xf0113000 <charcode>: Cannot access memory at address 0xf0113000
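
The register values at the crash can be cross-checked against memset's original arguments with a little arithmetic. The helper below is a hypothetical sketch (stos_progress is not part of JOS or gdb); it only uses numbers that appear in the gdb output, plus the assumption of 4 KiB pages.

```python
def stos_progress(v, n, crash_edi, crash_ecx, width=4):
    """Relate the registers at the fault to the original memset(v, c, n) call."""
    total = n // width                 # iterations requested (initial ecx)
    done = total - crash_ecx           # iterations that completed
    return {
        "dwords_done": done,
        "bytes_done": done * width,
        "bytes_left": crash_ecx * width,
        # edi should equal the start address plus the bytes already written
        "consistent": v + done * width == crash_edi,
        # True when the faulting store is the first one on a new 4 KiB page
        "page_aligned": crash_edi % 0x1000 == 0,
    }

info = stos_progress(v=0xF0111300, n=0x23A4,
                     crash_edi=0xF0113000, crash_ecx=0x1A9)
print(hex(info["bytes_done"]), hex(info["bytes_left"]))   # 0x1d00 0x6a4
print(info["consistent"], info["page_aligned"])           # True True
```

So 0x1d00 bytes were zeroed successfully, 0x6a4 bytes were still pending, and the store that faulted was the first one to touch the page at 0xf0113000.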

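Here, when the program stopped, ecx = 0x1a9 and edi = 0xf0113000. ecx had not reached 0, so there was still memory left to overwrite, yet execution stopped here. The kernel runs at CPL 0 and the fill is a write, so my first guess was that it had written to a read-only region; that said, gdb can no longer read even the code or the stack after the fault, which hints that something broader, such as the page tables themselves, was damaged. Either way, memset was handed a bad range. gdb labels 0xf0113000 as charcode and reports the faulting instruction at lib/string.c:131, so let's look at string.c: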

121 void *
122 memset(void *v, int c, size_t n)
123 {
124 char *p;
125
126 if (n == 0)
127 return v;
128 if ((int)v%4 == 0 && n%4 == 0) {
129 c &= 0xFF;
130 c = (c<<24)|(c<<16)|(c<<8)|c;
131 asm volatile("cld; rep stosl\n"
132 :: "D" (v), "a" (c), "c" (n/4)
133 : "cc", "memory");
134 } else
135 asm volatile("cld; rep stosb\n"
136 :: "D" (v), "a" (c), "c" (n)
137 : "cc", "memory");
138 return v;
139 }
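
For readers less used to inline assembly, the branch logic of this memset can be modelled in a few lines of Python. This is a sketch under stated assumptions: fill_word and memset_plan are hypothetical names, the pointer is treated as an unsigned integer, and nothing is actually written to memory; the function only reports which rep stos variant would run and with what eax and ecx.

```python
def fill_word(c):
    """Replicate the low byte of c into all four bytes of a 32-bit word,
    mirroring `c &= 0xFF; c = (c<<24)|(c<<16)|(c<<8)|c;`."""
    c &= 0xFF
    return ((c << 24) | (c << 16) | (c << 8) | c) & 0xFFFFFFFF

def memset_plan(v, c, n):
    """Return (instruction, eax, ecx) that memset(v, c, n) would use."""
    if n == 0:
        return ("none", 0, 0)
    if v % 4 == 0 and n % 4 == 0:
        # Aligned start and length: store 4 bytes per iteration.
        return ("rep stosl", fill_word(c), n // 4)
    # Otherwise fall back to one byte per iteration (only al is stored).
    return ("rep stosb", c & 0xFF, n)

# The call seen in gdb: memset(v=0xf0111300, c=0x0, n=0x23a4)
kind, eax, ecx = memset_plan(0xF0111300, 0x0, 0x23A4)
print(kind, hex(eax), hex(ecx))        # rep stosl 0x0 0x8e9

print(hex(fill_word(0xAB)))            # 0xabababab
```

This matches the debugger session: the fault is in the stosl path at line 131, with an initial count of 0x8e9 DWORDs.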

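The error occurs because the address range passed in (v together with n) is wrong, so we need to find which function passes these arguments. Use grep: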

root@kali:~/os/jos# grep -R memset *
inc/string.h:void * memset(void *dst, int c, size_t len);
kern/init.c: memset(edata, 0, end - edata);
lib/string.c:// Using assembly for memset/memmove
lib/string.c:memset(void *v, int c, size_t n)
lib/string.c:memset(void *v, int c, size_t n)
Binary file obj/kern/kernel.img matches
obj/kern/kernel.asm: memset(edata, 0, end - edata);
obj/kern/kernel.asm:f01000ca: e8 d0 14 00 00 call f010159f <memset>
obj/kern/kernel.asm:f010159f <memset>:
obj/kern/kernel.asm:memset(void *v, int c, size_t n)
obj/kern/kernel.asm:f01015af: 74 15 je f01015c6 <memset+0x27>
obj/kern/kernel.asm:f01015b7: 75 05 jne f01015be <memset+0x1f>
obj/kern/kernel.asm:f01015bc: 74 0d je f01015cb <memset+0x2c>
obj/kern/kernel.asm:f01015ec: eb d8 jmp f01015c6 <memset+0x27>
Binary file obj/kern/string.o matches
obj/kern/kernel.sym:f010159f T memset
Binary file obj/kern/init.o matches
Binary file obj/kern/kernel matches
————————————————————————————————————————————
Log from make:
+ as kern/entry.S
+ cc kern/entrypgdir.c
+ cc kern/init.c
+ cc kern/console.c
+ cc kern/monitor.c
+ cc kern/printf.c
+ cc kern/kdebug.c
+ cc lib/printfmt.c
+ cc lib/readline.c
+ cc lib/string.c
+ ld obj/kern/kernel
+ as boot/boot.S
+ cc -Os boot/main.c
+ ld boot/boot
boot block is 390 bytes (max 510)
+ mk obj/kern/kernel.img

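Every program has an entry point, and the kernel's execution order roughly follows the order of this make build log: it starts in kern/entry.S and moves on to kern/init.c. Combined with the search above, this tentatively places the bad memset call in init.c. The source: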

void
i386_init(void)
{
    extern char edata[], end[];

    // Before doing anything else, complete the ELF loading process.
    // Clear the uninitialized global data (BSS) section of our program.
    // This ensures that all static/global variables start out zero.
    memset(edata, 0, end - edata);

    // Initialize the console.
    // Can't call cprintf until after we do this!
    cons_init();

    cprintf("6828 decimal is %o octal!\n", 6828);

    // Test the stack backtrace function (lab 1 only)
    test_backtrace(5);

    // Drop into the kernel monitor.
    while (1)
        monitor(NULL);
}

The screenshot from the failure shows that none of the following was printed:

6828 decimal is XXX octal!
entering test_backtrace 5
entering test_backtrace 4
entering test_backtrace 3
entering test_backtrace 2
entering test_backtrace 1
entering test_backtrace 0
leaving test_backtrace 0
leaving test_backtrace 1
leaving test_backtrace 2
leaving test_backtrace 3
leaving test_backtrace 4
leaving test_backtrace 5

So the bad arguments come from memset(edata, 0, end - edata) in i386_init(void) in init.c. Next, find where edata is defined. It is an extern variable, so grep again:

root@kali:~/os/jos# grep -R edata *
kern/kernel.ld: PROVIDE(edata = .);
kern/monitor.c: extern char _start[], entry[], etext[], edata[], end[];
kern/monitor.c: cprintf(" edata %08x (virt) %08x (phys)\n", edata, edata - KERNBASE);
kern/init.c: extern char edata[], end[];
kern/init.c: memset(edata, 0, end - edata);
Binary file obj/kern/kernel.img matches
obj/kern/kernel.asm: extern char edata[], end[];
obj/kern/kernel.asm: memset(edata, 0, end - edata);
obj/kern/kernel.asm: extern char _start[], entry[], etext[], edata[], end[];
obj/kern/kernel.asm: cprintf(" edata %08x (virt) %08x (phys)\n", edata, edata - KERNBASE);
obj/kern/kernel.sym:f0111300 D edata
Binary file obj/kern/monitor.o matches
Binary file obj/kern/init.o matches
Binary file obj/kern/kernel matches
Binary file obj/boot/boot.out matches

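Everywhere else is just a reference; only kern/kernel.ld: PROVIDE(edata = .); looks like a definition. After a quick detour to read up on linker scripts, it turns out this syntax is indeed a definition. From the three comment lines above the memset call in i386_init, edata is meant to be the start address of the bss section and end its end address.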

// Before doing anything else, complete the ELF loading process.
// Clear the uninitialized global data (BSS) section of our program.
// This ensures that all static/global variables start out zero.

Open kern/kernel.ld and find where the variable is defined:

    /* Adjust the address for the data segment to the next page */
    . = ALIGN(0x1000);

    /* The data segment */
    .data : {
        *(.data)
    }

    PROVIDE(edata = .);

    .bss : {
        *(.bss)
    }

    PROVIDE(end = .);

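The definition looks fine. I also checked the ld file in xv6-os, and it defines the symbol the same way. So where is the problem? Let's look at the section addresses in the final executable with objdump: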

root@kali:~/os/jos# objdump -h ./obj/kern/kernel
./obj/kern/kernel: file format elf32-i386
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 000019d9 f0100000 00100000 00001000 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .rodata 000006c0 f01019e0 001019e0 000029e0 2**5
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .stab 00003bb9 f01020a0 001020a0 000030a0 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .stabstr 00001958 f0105c59 00105c59 00006c59 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .data 00009300 f0108000 00108000 00009000 2**12
CONTENTS, ALLOC, LOAD, DATA
5 .got 00000008 f0111300 00111300 00012300 2**2
CONTENTS, ALLOC, LOAD, DATA
6 .got.plt 0000000c f0111308 00111308 00012308 2**2
CONTENTS, ALLOC, LOAD, DATA
7 .data.rel.local 00001000 f0112000 00112000 00013000 2**12
CONTENTS, ALLOC, LOAD, DATA
8 .data.rel.ro.local 00000044 f0113000 00113000 00014000 2**2
CONTENTS, ALLOC, LOAD, DATA
9 .bss 00000644 f0113060 00113060 00014044 2**5
ALLOC
10 .comment 0000001c 00000000 00000000 00014044 2**0
CONTENTS, READONLY
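
The section table above can be compared with the range that memset was asked to clear. The sketch below copies the names, sizes and VMAs from the objdump output and the memset arguments from gdb; the helper names are hypothetical. It also maps CR3 from the register dump back to a virtual address, assuming the VMA minus LMA offset of 0xf0000000 shown by objdump applies to it.

```python
# (name, size, VMA) copied from `objdump -h obj/kern/kernel`
SECTIONS = [
    (".text",              0x19D9, 0xF0100000),
    (".rodata",            0x06C0, 0xF01019E0),
    (".stab",              0x3BB9, 0xF01020A0),
    (".stabstr",           0x1958, 0xF0105C59),
    (".data",              0x9300, 0xF0108000),
    (".got",               0x0008, 0xF0111300),
    (".got.plt",           0x000C, 0xF0111308),
    (".data.rel.local",    0x1000, 0xF0112000),
    (".data.rel.ro.local", 0x0044, 0xF0113000),
    (".bss",               0x0644, 0xF0113060),
]

def overlapping(sections, start, length):
    """Names of sections that intersect the half-open range [start, start+length)."""
    end = start + length
    return [name for name, size, vma in sections
            if vma < end and start < vma + size]

def section_of(sections, addr):
    """Name of the section containing addr, or None."""
    for name, size, vma in sections:
        if vma <= addr < vma + size:
            return name
    return None

EDATA, N = 0xF0111300, 0x23A4          # memset(edata, 0, end - edata)
print(overlapping(SECTIONS, EDATA, N))
# ['.got', '.got.plt', '.data.rel.local', '.data.rel.ro.local', '.bss']

KERNBASE_OFFSET = 0xF0000000           # VMA - LMA in the objdump output
CR3 = 0x00112000                       # from the QEMU register dump
print(section_of(SECTIONS, KERNBASE_OFFSET + CR3))   # .data.rel.local
print(section_of(SECTIONS, 0xF0113000))              # .data.rel.ro.local
```

So the memset covers four sections besides .bss, and the physical address in CR3 falls inside one of them, .data.rel.local. Since CR3 holds the physical address of the page directory, this suggests (though the post does not confirm it) that the page directory itself is being zeroed before the fault.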

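0xf0113000 is the start of the .data.rel.ro.local section, which holds relocation-related data. edata (0xf0111300) is the end of .data, but .got, .got.plt, .data.rel.local and .data.rel.ro.local all sit between it and .bss at 0xf0113060, so edata is clearly not the start of bss, and memset zeroes all of those sections too. Note that objdump does not actually flag .data.rel.ro.local as READONLY. A more likely trigger, I think, is .data.rel.local: its load address 0x00112000 equals CR3 in the register dump, so the memset appears to wipe the page directory, and the first access to a new page (0xf0113000) then faults. Either way, edata has to be changed to the real start of bss. (edata works in xv6-os presumably because xv6 is built statically without position-independent code, so it has no relocation sections such as .got and .got.plt, and the data section is directly followed by bss.)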

    /* The data segment */
    .data : {
        *(.data)
    }

    .bss : {
        edata = .;
        *(.bss)
    }

    PROVIDE(end = .);

Moving edata inside the .bss definition finally made it work. I don't know whether this is the proper way to do it, but whatever, it works.

root@kali:~/os/jos# make clean
rm -rf obj .gdbinit jos.in qemu.log
root@kali:~/os/jos# make
+ as kern/entry.S
+ cc kern/entrypgdir.c
+ cc kern/init.c
+ cc kern/console.c
+ cc kern/monitor.c
+ cc kern/printf.c
+ cc kern/kdebug.c
+ cc lib/printfmt.c
+ cc lib/readline.c
+ cc lib/string.c
+ ld obj/kern/kernel
+ as boot/boot.S
+ cc -Os boot/main.c
+ ld boot/boot
boot block is 390 bytes (max 510)
+ mk obj/kern/kernel.img
root@kali:~/os/jos# make qemu
sed "s/localhost:1234/localhost:25000/" < .gdbinit.tmpl > .gdbinit
qemu-system-i386 -drive file=obj/kern/kernel.img,index=0,media=disk,format=raw -serial mon:stdio -gdb tcp::25000 -D qemu.log
6828 decimal is XXX octal!
entering test_backtrace 5
entering test_backtrace 4
entering test_backtrace 3
entering test_backtrace 2
entering test_backtrace 1
entering test_backtrace 0
leaving test_backtrace 0
leaving test_backtrace 1
leaving test_backtrace 2
leaving test_backtrace 3
leaving test_backtrace 4
leaving test_backtrace 5
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
K>

A few things I just learned
Inline assembly 1
Inline assembly 2
ld linker syntax 1
ld linker syntax 2