Buffer overflow attack is one of the most basic techniques used to exploit binaries which do not perform bound checks on the user’s input. It involves crafting the input to a program in a manner such that it overwrites adjacent memory locations and deliberately causes an unexpected behavior.

Typically this involves overwriting the return address saved on the stack and cause the program to redirect execution into another part of the code segment or into shellcode stored on the stack.

Consider the below function:

void bar()
{
	return;
}
void foo()
{
	bar();
}

int main()
{
	foo();
}

On compiling for ARM32 and disassembling it:

000103a0 <bar>:
   103a0:	b480      	push	{r7}
   103a2:	af00      	add	r7, sp, #0
   103a4:	bf00      	nop
   103a6:	46bd      	mov	sp, r7
   103a8:	f85d 7b04 	ldr.w	r7, [sp], #4
   103ac:	4770      	bx	lr

000103ae <foo>:
   103ae:	b580      	push	{r7, lr}
   103b0:	af00      	add	r7, sp, #0
   103b2:	f7ff fff5 	bl	103a0 <bar>
   103b6:	bf00      	nop
   103b8:	bd80      	pop	{r7, pc}

000103ba <main>:
   103ba:	b580      	push	{r7, lr}
   103bc:	af00      	add	r7, sp, #0
   103be:	f7ff fff6 	bl	103ae <foo>
   103c2:	2300      	movs	r3, #0
   103c4:	4618      	mov	r0, r3
   103c6:	bd80      	pop	{r7, pc}

In ARM32 register r7 is used as a frame pointer to keep track stack frames being created while lr link register is used to hold the return address following return from a function call.

In function foo the frame pointer pointing to the stack address of the previous function is pushed onto the stack with the link register in the current function’s stack frame.

The stack frame for foo would look something like below:

		-----------------
		|       R7      |
		-----------------
		|       LR      |
		-----------------
		|               |
		|   stack area  |
		|      for      |
		|   local vars  |   <----- SP
		-----------------

On similarly recompiling the code for ARM64 would give the below disassembly:

000000000040051c <bar>:
  40051c:	d503201f 	nop
  400520:	d65f03c0 	ret

0000000000400524 <foo>:
  400524:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
  400528:	910003fd 	mov	x29, sp
  40052c:	97fffffc 	bl	40051c <bar>
  400530:	d503201f 	nop
  400534:	a8c17bfd 	ldp	x29, x30, [sp], #16
  400538:	d65f03c0 	ret

000000000040053c <main>:
  40053c:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
  400540:	910003fd 	mov	x29, sp
  400544:	97fffff8 	bl	400524 <foo>
  400548:	52800000 	mov	w0, #0x0                   	// #0
  40054c:	a8c17bfd 	ldp	x29, x30, [sp], #16
  400550:	d65f03c0 	ret
  400554:	00000000 	.inst	0x00000000 ; undefined

In ARM64 x29 register serves the purpose of frame pointer while x30 is the link register. But there is a diffrence in the way these registers are stored in the current stack frame. The frame size is known at the start of the function. The stack pointer is reduced by this size and the pair of registers are stored at the bottom of the stack.

In this case stack frame for foo would look like:

		-----------------
		|               |
		|   stack area  |
		|      for      |
		|   local vars  |   		
		-----------------
		|       x29     |
		-----------------
		|       x30     |    <----- SP
		-----------------

This diffrence in the stack frame organization made me wonder if it is possible to corrupt the return address on the stack. A common standard function which should not be used due to lack of bound checks on the user input is gets. I will use this function to introduce an overflow vulnerability.

I’ll take the same program and invoke gets inside bar. The aim will be to redirect program flow to another function called target.

void target()
{
	printf("Success\n");
	exit(0);
}
void bar()
{
	char str[10];
	gets(str);
}
void foo()
{
	bar();
}

int main()
{
	foo();
}

Disassembly

00000004005fc <target>:
  4005fc:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
  400600:	910003fd 	mov	x29, sp
  400604:	90000000 	adrp	x0, 400000 <_init-0x460>
  400608:	911c6000 	add	x0, x0, #0x718
  40060c:	97ffffb5 	bl	4004e0 <puts@plt>
  400610:	52800000 	mov	w0, #0x0                   	// #0
  400614:	97ffffa3 	bl	4004a0 <exit@plt>

0000000000400618 <bar>:
  400618:	a9be7bfd 	stp	x29, x30, [sp, #-32]!
  40061c:	910003fd 	mov	x29, sp
  400620:	910043a0 	add	x0, x29, #0x10
  400624:	97ffffb3 	bl	4004f0 <gets@plt>
  400628:	d503201f 	nop
  40062c:	a8c27bfd 	ldp	x29, x30, [sp], #32
  400630:	d65f03c0 	ret

0000000000400634 <foo>:
  400634:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
  400638:	910003fd 	mov	x29, sp
  40063c:	97fffff7 	bl	400618 <bar>
  400640:	d503201f 	nop
  400644:	a8c17bfd 	ldp	x29, x30, [sp], #16
  400648:	d65f03c0 	ret

000000000040064c <main>:
  40064c:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
  400650:	910003fd 	mov	x29, sp
  400654:	97fffff8 	bl	400634 <foo>
  400658:	52800000 	mov	w0, #0x0                   	// #0
  40065c:	a8c17bfd 	ldp	x29, x30, [sp], #16
  400660:	d65f03c0 	ret
  400664:	00000000 	.inst	0x00000000 ; undefined

Using GDB, I set breakpoints to stop and observe the stack contents.

The red outline shows the frame pointer and return address saved at bottom of current stack frame. Our aim is to corrupt the return address stored in the previous stack frame to redirect code execution to target function address 4005fc.

Using GDB we can inspect the stack contents to see if the overwrite was successful.

Though the str variable is declared as 10 bytes, the compiler allocates stack area in multiples of word size ( here 16 bytes is allocated) as this helps in making memory accesses faster. We can observe that previous frame’s return address has been corrupted and pointing to target. On continuing execution target is hit.

Conclusion

This subject is quite interesting for me due to the fact that I have tried buffer overflow only on x86 and ARM32 and in both return address would be saved at the beginning of the stack frame making it possible to corrupt the current frame’s return address.

In ARM64 we cannot corrupt the current frame’s return address or rather it does not seem straight forward to me. I was able to corrupt the previous frame’s return address which I never thought was possible.

So to sum up, in ARM64 the return address is stored at the bottom of the current stack frame. This was specified in the ARM reference manual under the subtopic of function calls and stack creation. I wonder if this can be done for the ARM 32 bit architecture similarly by modifying the compiler.

If anyone is an expert in this topic, let me know your thoughts in the comments.