Stuff in ARM Instruction

This article studies the ARM instruction set: how some special instructions work and how function calls are implemented.

Thumb VS ARM

In this section, I compare ARM instructions with Thumb instructions. First, here is the code to be analysed.

#include<stdio.h>

int main(void){
    int i;
    for (i=0; i < 10; i++);
    return 0;
}

I used the cross-compilation tools on Ubuntu to compile the code, once as ARM code and once as Thumb code.

arm-linux-gnueabi-gcc test.c -o test.a
arm-linux-gnueabi-gcc test.c -mthumb -o test.t

Next, use the objdump tool from the cross-compilation toolchain to inspect the assembly code.

arm-linux-gnueabi-objdump -d test.a
arm-linux-gnueabi-objdump -d test.t

ARM code

0000840c <main>:
840c: e52db004 push {fp} ; (str fp, [sp, #-4]!)
8410: e28db000 add fp, sp, #0
8414: e24dd00c sub sp, sp, #12
8418: e3a03000 mov r3, #0
841c: e50b3008 str r3, [fp, #-8]
8420: ea000002 b 8430 <main+0x24>
8424: e51b3008 ldr r3, [fp, #-8]
8428: e2833001 add r3, r3, #1
842c: e50b3008 str r3, [fp, #-8]
8430: e51b3008 ldr r3, [fp, #-8]
8434: e3530009 cmp r3, #9
8438: dafffff9 ble 8424 <main+0x18>
843c: e3a03000 mov r3, #0
8440: e1a00003 mov r0, r3
8444: e28bd000 add sp, fp, #0
8448: e8bd0800 ldmfd sp!, {fp}
844c: e12fff1e bx lr

As you can see, every instruction is 32 bits wide.

Thumb code

0000840c <main>:
840c: b580 push {r7, lr}
840e: b082 sub sp, #8
8410: af00 add r7, sp, #0
8412: 2300 movs r3, #0
8414: 607b str r3, [r7, #4]
8416: e002 b.n 841e <main+0x12>
8418: 687b ldr r3, [r7, #4]
841a: 3301 adds r3, #1
841c: 607b str r3, [r7, #4]
841e: 687b ldr r3, [r7, #4]
8420: 2b09 cmp r3, #9
8422: ddf9 ble.n 8418 <main+0xc>
8424: 2300 movs r3, #0
8426: 1c18 adds r0, r3, #0
8428: 46bd mov sp, r7
842a: b002 add sp, #8
842c: bd80 pop {r7, pc}
842e: 46c0 nop ; (mov r8, r8)

Each instruction here is 16 bits wide, as expected. Compared with the ARM version, main is roughly half the size: 36 bytes (including the trailing nop, which is only padding) against 68 bytes. The number of instructions is about the same, 18 against 17, which suggests a similar execution time for a much shorter program.

-rwxrwxr-x 1 sea sea 8346 Mar 31 07:08 test.a
-rwxrwxr-x 1 sea sea 8349 Mar 31 07:08 test.t

However, the Thumb executable as a whole is 3 bytes larger. main is only a small part of the file: most of an executable is other code, such as the startup code prepared for the operating system, and that part still consists of ARM instructions.

Conditional Instructions

In this section I show how to get conditionally executed instructions. Without optimization the compiler generates ordinary branch-based code, so an optimization option must be enabled. As before, here is the target code first.

#include<stdio.h>

int main(void){
    int a, b;
    scanf("%d %d", &a, &b);
    if (a>b)
        a++;
    else
        b++;
    printf("%d %d", a, b);
}

And the compile command.

arm-linux-gnueabi-gcc test.c -o test.a -O

Now we can see the conditional instructions: after the cmp, addgt and strgt run only if a > b, while addle and strle run only otherwise.

0000849c <main>:
849c: e52de004 push {lr} ; (str lr, [sp, #-4]!)
84a0: e24dd00c sub sp, sp, #12
84a4: e59f0040 ldr r0, [pc, #64] ; 84ec <main+0x50>
84a8: e1a0100d mov r1, sp
84ac: e28d2004 add r2, sp, #4
84b0: ebffffa6 bl 8350 <_init+0x44>
84b4: e59d2000 ldr r2, [sp]
84b8: e59d3004 ldr r3, [sp, #4]
84bc: e1520003 cmp r2, r3
84c0: c2822001 addgt r2, r2, #1
84c4: c58d2000 strgt r2, [sp]
84c8: d2833001 addle r3, r3, #1
84cc: d58d3004 strle r3, [sp, #4]
84d0: e3a00001 mov r0, #1
84d4: e59f1010 ldr r1, [pc, #16] ; 84ec <main+0x50>
84d8: e59d2000 ldr r2, [sp]
84dc: e59d3004 ldr r3, [sp, #4]
84e0: ebffff97 bl 8344 <_init+0x38>
84e4: e28dd00c add sp, sp, #12
84e8: e8bd8000 ldmfd sp!, {pc}
84ec: 00008564 .word 0x00008564

Register Shift

Let’s construct a scenario that generates an instruction with a shifted register operand. The target code is here.

#include<stdio.h>

int main(void){
    int a, b;
    scanf("%d %d", &a, &b);
    /* + binds tighter than <<, so each line is parsed as (a + b) << 4 */
    a = a + b << 4;
    a = a + b << 4;
    printf("%d %d", a, b);
}

As before, we must enable optimization so that the compiler does not generate naive code. However, the optimizer would eliminate all of the computation if the variables were never used, so input and output are introduced.

arm-linux-gnueabi-gcc test.c -o test.a -O

Dump it and check the machine code. The add at 84c4 folds the shift into its second operand (r2, lsl #4).

0000849c <main>:
849c: e92d4010 push {r4, lr}
84a0: e24dd008 sub sp, sp, #8
84a4: e59f4038 ldr r4, [pc, #56] ; 84e4 <main+0x48>
84a8: e1a00004 mov r0, r4
84ac: e1a0100d mov r1, sp
84b0: e28d2004 add r2, sp, #4
84b4: ebffffa5 bl 8350 <_init+0x44>
84b8: e59d3004 ldr r3, [sp, #4]
84bc: e59d2000 ldr r2, [sp]
84c0: e0832002 add r2, r3, r2
84c4: e0832202 add r2, r3, r2, lsl #4
84c8: e1a02202 lsl r2, r2, #4
84cc: e58d2000 str r2, [sp]
84d0: e3a00001 mov r0, #1
84d4: e1a01004 mov r1, r4
84d8: ebffff99 bl 8344 <_init+0x38>
84dc: e28dd008 add sp, sp, #8
84e0: e8bd8010 pop {r4, pc}
84e4: 0000855c .word 0x0000855c

Load a 32-bit Number

Now I will show how the assignment of a 32-bit constant is implemented.

#include<stdio.h>

int main(void){
    int a, b;
    scanf("%d %d", &a, &b);
    a=0x12312312;
    printf("%d %d", a, b);
}

The variable a is assigned a 32-bit constant. Compile it and check the assembly.

00008494 <main>:
8494: e92d4800 push {fp, lr}
8498: e28db004 add fp, sp, #4
849c: e24dd008 sub sp, sp, #8
84a0: e24b200c sub r2, fp, #12
84a4: e24b3008 sub r3, fp, #8
84a8: e59f0034 ldr r0, [pc, #52] ; 84e4 <main+0x50>
84ac: e1a01002 mov r1, r2
84b0: e1a02003 mov r2, r3
84b4: ebffffa3 bl 8348 <_init+0x44>
84b8: e59f3028 ldr r3, [pc, #40] ; 84e8 <main+0x54>
84bc: e50b300c str r3, [fp, #-12]
84c0: e51b200c ldr r2, [fp, #-12]
84c4: e51b3008 ldr r3, [fp, #-8]
84c8: e59f0014 ldr r0, [pc, #20] ; 84e4 <main+0x50>
84cc: e1a01002 mov r1, r2
84d0: e1a02003 mov r2, r3
84d4: ebffff92 bl 8324 <_init+0x20>
84d8: e1a00003 mov r0, r3
84dc: e24bd004 sub sp, fp, #4
84e0: e8bd8800 pop {fp, pc}
84e4: 00008560 .word 0x00008560
84e8: 12312312 .word 0x12312312

As you can see, the constant is stored right after the function's code (at 84e8), and it is loaded from memory into a register with a pc-relative ldr instruction.

Function Call

Function calls are an important part of every program, so I study the function call mechanism in this section.
Let’s present the target code as usual.

#include<stdio.h>

int x(int a){
    a=0x440;
    printf("%d", a);
}

int g(int a, int b){
    b=0x220;
    b = b+1;
    x(b);
    b = b+1;
}

int f(int a){
    int b=0x130;
    b = b+1;
    g(b, a);
    b = b+1;
}

int main(void){
    int a, b;
    scanf("%d %d", &a, &b);
    a=0x12312312;
    f(a);
    printf("%d %d", a, b);
}

Then, compile it and inspect the machine code.

00008510 <f>:
8510: e92d4800 push {fp, lr}
8514: e28db004 add fp, sp, #4
8518: e24dd010 sub sp, sp, #16
851c: e50b0010 str r0, [fp, #-16]
8520: e3a03e13 mov r3, #304 ; 0x130
8524: e50b3008 str r3, [fp, #-8]
8528: e51b3008 ldr r3, [fp, #-8]
852c: e2833001 add r3, r3, #1
8530: e50b3008 str r3, [fp, #-8]
8534: e51b0008 ldr r0, [fp, #-8]
8538: e51b1010 ldr r1, [fp, #-16]
853c: ebffffe1 bl 84c8 <g>
8540: e51b3008 ldr r3, [fp, #-8]
8544: e2833001 add r3, r3, #1
8548: e50b3008 str r3, [fp, #-8]
854c: e1a00003 mov r0, r3
8550: e24bd004 sub sp, fp, #4
8554: e8bd8800 pop {fp, pc}

00008558 <main>:
8558: e92d4800 push {fp, lr}
855c: e28db004 add fp, sp, #4
8560: e24dd008 sub sp, sp, #8
8564: e24b200c sub r2, fp, #12
8568: e24b3008 sub r3, fp, #8
856c: e59f0040 ldr r0, [pc, #64] ; 85b4 <main+0x5c>
8570: e1a01002 mov r1, r2
8574: e1a02003 mov r2, r3
8578: ebffff72 bl 8348 <_init+0x44>
857c: e59f3034 ldr r3, [pc, #52] ; 85b8 <main+0x60>
8580: e50b300c str r3, [fp, #-12]
8584: e51b300c ldr r3, [fp, #-12]
8588: e1a00003 mov r0, r3
858c: ebffffdf bl 8510 <f>
8590: e51b200c ldr r2, [fp, #-12]
8594: e51b3008 ldr r3, [fp, #-8]
8598: e59f0014 ldr r0, [pc, #20] ; 85b4 <main+0x5c>
859c: e1a01002 mov r1, r2
85a0: e1a02003 mov r2, r3
85a4: ebffff5e bl 8324 <_init+0x20>
85a8: e1a00003 mov r0, r3
85ac: e24bd004 sub sp, fp, #4
85b0: e8bd8800 pop {fp, pc}
85b4: 00008634 .word 0x00008634
85b8: 12312312 .word 0x12312312

00008494 <x>:
8494: e92d4800 push {fp, lr}
8498: e28db004 add fp, sp, #4
849c: e24dd010 sub sp, sp, #16
84a0: e50b0010 str r0, [fp, #-16]
84a4: e3a03d11 mov r3, #1088 ; 0x440
84a8: e50b3008 str r3, [fp, #-8]
84ac: e59f0010 ldr r0, [pc, #16] ; 84c4 <x+0x30>
84b0: e51b1008 ldr r1, [fp, #-8]
84b4: ebffff9a bl 8324 <_init+0x20>
84b8: e1a00003 mov r0, r3
84bc: e24bd004 sub sp, fp, #4
84c0: e8bd8800 pop {fp, pc}
84c4: 00008630 .word 0x00008630

000084c8 <g>:
84c8: e92d4800 push {fp, lr}
84cc: e28db004 add fp, sp, #4
84d0: e24dd010 sub sp, sp, #16
84d4: e50b0010 str r0, [fp, #-16]
84d8: e50b1014 str r1, [fp, #-20]
84dc: e3a03e22 mov r3, #544 ; 0x220
84e0: e50b3008 str r3, [fp, #-8]
84e4: e51b3008 ldr r3, [fp, #-8]
84e8: e2833001 add r3, r3, #1
84ec: e50b3008 str r3, [fp, #-8]
84f0: e51b0008 ldr r0, [fp, #-8]
84f4: ebffffe6 bl 8494 <x>
84f8: e51b3008 ldr r3, [fp, #-8]
84fc: e2833001 add r3, r3, #1
8500: e50b3008 str r3, [fp, #-8]
8504: e1a00003 mov r0, r3
8508: e24bd004 sub sp, fp, #4
850c: e8bd8800 pop {fp, pc}

Return Address

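The return address is stored in register lr when the bl instruction is executed. Each function in the listing then pushes lr onto the stack in its prologue (push {fp, lr}), so that a nested bl cannot destroy it, and returns by popping the saved value straight into pc (pop {fp, pc}).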

Parameter Passing

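Parameters are passed in registers: here the first argument is in r0 and the second in r1. Under the ARM procedure call standard, the first four word-sized arguments go in r0-r3 and any further ones are passed on the stack.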

Local Variable

Local variables are allocated when the function is called. The prologue first pushes the old frame pointer and the return address, then sets fp and lowers sp to reserve stack space (sub sp, sp, #16). Locals and saved parameters are then stored at fixed negative offsets from fp, such as [fp, #-8] and [fp, #-16].

Register Save

r0-r3 are saved by the caller, while r4 and the registers above it (r4-r11 in the standard calling convention) are saved by the callee.

BIC

To obtain the BIC instruction, we must write an operation that matches the logic of this instruction: an AND with the complement of the second operand.

#include<stdio.h>

int main(void){
    int a, b, c, d;
    scanf("%d %d %d %d", &a, &b, &c, &d);
    a = b & (~c);
    printf("%d %d %d %d", a, b, c, d);
}

Again, the optimization option is necessary.

arm-linux-gnueabi-gcc test.c -o test.a -O

Then we can find the BIC instruction (at 84cc).

0000849c <main>:
849c: e92d4010 push {r4, lr}
84a0: e24dd018 sub sp, sp, #24
84a4: e59f4048 ldr r4, [pc, #72] ; 84f4 <main+0x58>
84a8: e28d3014 add r3, sp, #20
84ac: e58d3000 str r3, [sp]
84b0: e1a00004 mov r0, r4
84b4: e28d1008 add r1, sp, #8
84b8: e28d200c add r2, sp, #12
84bc: e28d3010 add r3, sp, #16
84c0: ebffffa2 bl 8350 <_init+0x44>
84c4: e59d1010 ldr r1, [sp, #16]
84c8: e59d300c ldr r3, [sp, #12]
84cc: e1c32001 bic r2, r3, r1
84d0: e58d2008 str r2, [sp, #8]
84d4: e58d1000 str r1, [sp]
84d8: e59d1014 ldr r1, [sp, #20]
84dc: e58d1004 str r1, [sp, #4]
84e0: e3a00001 mov r0, #1
84e4: e1a01004 mov r1, r4
84e8: ebffff95 bl 8344 <_init+0x38>
84ec: e28dd018 add sp, sp, #24
84f0: e8bd8010 pop {r4, pc}
84f4: 0000856c .word 0x0000856c

Inline Assembly

In this section, I write a function in pure ARM assembly. One thing worth mentioning is that we must save and restore the return address, because bl puts overwrites lr; otherwise the function could not return properly.

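#include<stdio.h>

void test(char *s, int n); /* implemented in the asm block below */

asm(
"test:\n"
"push {r3, r4, fp, lr}\n"  /* save lr, since bl below overwrites it */
"add r3, r0, r1\n"         /* r3 = s + n */
"mov r4, #0\n"
"strb r4, [r3]\n"          /* s[n] = 0 */
"bl puts\n"                /* r0 still holds s */
"pop {r3, r4, fp, pc}\n"   /* return by loading the saved lr into pc */
);

int main(void){
    char s[10]="abcdefghij";
    test(s, 3);
}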

The function calls puts to print the string, which by then has been cut off after its first three characters.