
Thread: Understanding the Stack -- The Assembly Language Perspective

  1. #1
    Garage Member
    Join Date
    Sep 2010
    Location
    Chennai
    Posts
    83
    Blog Entries
    1

    Understanding the Stack -- The Assembly Language Perspective


    By Sebas Sujeen aka "0x90" and Sridhar aka "phr3ak"

    To develop exploits and understand the exploit development process, it is imperative to understand how the stack works from the assembly language perspective as well. This article explains the stack from that perspective. To get the most from it, we advise you to go through the Assembly Language Primer for Hackers by Vivek Ramachandran (http://www.securitytube.net) before reading this.

    Code:
    An example C program

    #include <stdio.h>

    int main(void)
    {
        int i;

        /* print "Hello" ten times */
        for (i = 0; i < 10; i++)
            printf("Hello");
    }
    Compile it and generate an assembly file using the '-S' switch in gcc:

    Code:
    $gcc -mpreferred-stack-boundary=2 -fno-stack-protector -z execstack -o test.s test.c -S
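    Note that we set the preferred stack boundary to 2^2 (the 2 is the power of 2, i.e. 4 bytes) to simplify debugging. On recent versions of Linux you can also turn off the stack protector with '-fno-stack-protector' and make the stack executable (disabling NX for the stack) with '-z execstack'. The '-z execstack' option is passed to the linker, so it only takes effect when an executable is linked, not in this '-S' step. On a 64-bit system you would also need '-m32' to get the 32-bit code shown here.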

    Below is the listing of the test.s file.
    Note that the instructions are numbered only so that we can refer to them later.
    Code:
    .LC0:
    	.string	"Hello" --> defines the string "Hello" under the label .LC0, similar to a variable in C
    .text       --> this is the text section, where the actual code goes. This section is read-only, and any attempt to write into it results in a segmentation fault
    .globl main --> makes the symbol main visible to the linker. In the Assembly Language Primer the entry point is _start; since we assemble and link in one step using gcc, the C runtime supplies _start and calls main, so we define main instead
    	.type	main, @function --> declares that main is a function
    main:                       --> start of the main function
    	pushl	%ebp  -->1
    	movl	%esp, %ebp -->2
    	subl	$8, %esp -->3
    	movl	$0, -4(%ebp) -->4
    	jmp	.L2  -->5
    .L3:
    	movl	$.LC0, %eax -->6
    	movl	%eax, (%esp) -->7
    	call	printf  -->8
    	addl	$1, -4(%ebp) -->9
    .L2:
    	cmpl	$9, -4(%ebp) -->10
    	jle	.L3 -->11
    	leave -->12
    	ret -->13
    OK, here we go! The code is explained below.
    Code:
     
    pushl %ebp     
    movl %esp,%ebp
    subl $8,%esp
    These 3 statements constitute the function prologue. Refer to the Assembly Language Primer for more details about the function prologue. One interesting thing to note: if main is the first function to be executed, where does it return to? Keep that thought in mind; we will get to it when we debug the program. We also subtract 8 from esp to allocate space for the local variable 'i' and for the address of "Hello", which printf() needs.

    Let's debug it by firing up gdb.

    Before that, let's create the executable file:
    Code:
    $gcc -o test test.s -ggdb
    The '-ggdb' switch includes debugging symbols in the executable for analysis by the debugger.
    Then let's fire up gdb:
    Code:
    $gdb ./test -q
    '-q' is for quiet mode.
    The list command lists the source, which here is the assembly file:
    Code:
     (gdb) list
    1  		.file	"first.c"
    2		.section	.rodata
    3	.LC0:
    4		.string	"Hello"
    5		.text
    6	.globl main
    7		.type	main, @function
    8	main:
    9		pushl	%ebp
    10		movl	%esp, %ebp
    (gdb) 
    11		subl	$8, %esp
    12		movl	$0, -4(%ebp)
    13		jmp	.L2
    14	.L3:
    15		movl	$.LC0, %eax
    16		movl	%eax, (%esp)
    17		call	printf
    18		addl	$1, -4(%ebp)
    19	.L2:
    20		cmpl	$9, -4(%ebp)
    (gdb) 
    21 	 jle	.L3
    22		leave
    23		ret
    24		.size	main, .-main
    25		.ident	"GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    26		.section	.note.GNU-stack,"",@progbits
    Set a breakpoint at the appropriate location. Let's set it at the first instruction of the main function, i.e. line 9 in this case.
    Code:
     (gdb) break 9
     This sets a breakpoint at line 9, so when you run the program, execution pauses there.
     (gdb) run
     This runs the program; execution halts at the breakpoint.
     Now inspect the esp register, which points to the top of the stack:
     (gdb) i r esp
     This displays the address held in esp, in our case "0xbffff44c".
     (gdb) s
     This steps to the next instruction, so pushl %ebp (1) is executed. What has happened? After a push, the stack pointer esp is decremented by 4 bytes, because the stack grows down in memory from higher to lower addresses (4 bytes because this is 32-bit code).
     (gdb) i r ebp
     This displays the address held in ebp, in our case "0xbffff4c8".
     (gdb) i r esp
     This displays "0xbffff448"; note the decrement of 4 bytes from the previous value.
     (gdb) s
     Instruction (2), movl %esp, %ebp, has now been executed. It makes ebp point to the base of the new stack frame.
     (gdb) i r esp
     (gdb) i r ebp
     Both display "0xbffff448".
     (gdb) s
     Instruction (3), subl $8, %esp, has now been executed. It decrements esp by 8 bytes to allocate space for the local variable "i" and for the address of "Hello" required by printf().
     (gdb) i r esp
     This displays "0xbffff440".
     (gdb) s
     This executes instruction (4). A few words first: ebp is used to refer to local variables through a fixed offset. You could ask why esp is not used for this, but esp changes with every push and pop, so the offset to a given variable would have to be recalculated each time; ebp does not change within a stack frame. Of the 8 bytes allocated, the 4 bytes at -4(%ebp) hold "i" and the 4 bytes at (%esp), i.e. -8(%ebp), hold the address of "Hello". movl $0, -4(%ebp) uses base-plus-displacement (indirect) addressing: the () acts as a dereferencing operator, similar to '*' used with pointers in C. So this instruction stores 0 at the address ebp-4. The debugger makes this clear:
     (gdb) i r ebp
     gives "0xbffff448"
     (gdb) print $ebp-4
     This means 'take the address held in ebp and subtract 4 from it', i.e. 0xbffff444.
     The 'x' (examine) command in gdb displays memory. Refer to the Assembly Language Primer to learn more about it.
     (gdb) x/1xw $ebp-4
     This examines 1 word (in hex) at address 0xbffff444
     and returns 0x00000000.
     (gdb) s
     Stepping past the unconditional jmp .L2 (5) takes us to "cmpl $9, -4(%ebp)" (10).
     It compares i (currently 0) with 9 by computing 0 - 9; only the flags are updated, and the result is not written back to -4(%ebp). The result is obviously negative.
     You can check this in the EFLAGS register: the Sign Flag is set (it is set whenever the result of an arithmetic operation is negative).
     (gdb) i r eflags
     [ CF AF SF IF ID ] --> note that the sign flag (SF) is set
     (gdb) s
     The next instruction is jle .L3 (11), a conditional jump. It jumps to .L3 if i <= 9 as a signed comparison, i.e. if ZF is set or SF differs from OF. Here SF is set and OF is clear, so the jump is taken.
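     (gdb) s
     The next instruction is "movl $.LC0, %eax" (6).
     This moves the address of "Hello" into the eax register.
     (gdb) i r eax
     This returns "0x80484d0".
     Now examine the memory at this address:
     (gdb) x/1s 0x80484d0
     This returns '0x80484d0:	 "Hello"'.
     (gdb) s
     Instruction (7), movl %eax, (%esp), stores that address at the top of the stack as the argument for printf. The next instruction is call printf (8), which pushes the return address and jumps to printf.
     There are some things to remember when you call a function from an assembly program.
     Assume the function call in C looks like this: fn(1,2)
     In assembly language this typically becomes:
     pushl $2
     pushl $1
     call fn
     Note that the arguments are pushed in reverse order (right to left). Because the stack is a LIFO structure, the argument pushed last, which is the first argument, ends up on top of the stack, right next to the return address that call pushes. In our listing gcc builds the same layout without push: it reserves the space in the prologue and writes the argument with movl %eax, (%esp).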
     (gdb) s
     The next instruction is addl $1, -4(%ebp) (9).
     This increments the value of i by 1.
     The loop repeats as long as i <= 9, thereby printing "Hello" ten times on the screen.
     Set a breakpoint at line 22 (leave) and continue:
     (gdb) break 22
     (gdb) cont
     This continues execution until the 'leave' instruction is reached and pauses there.
     The leave instruction releases the space allocated for local variables and function arguments.
     It is equivalent to "movl %ebp,%esp"
                          "popl %ebp", which pops the saved frame pointer so that the function that called this one can resume with its own frame.
     The last instruction, "ret", pops the saved return address into the eip register so that execution resumes in the function that called main.
    
     Finally, after the ret instruction is executed, if you step with gdb
     (gdb) s
     you can see this: 0x00144bd6 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
    This means that main is called from __libc_start_main() and returns to it, which answers the question raised earlier.
    Hope it helps! Upcoming papers will be about exploit development (mainly on Linux).

    Greetz : fb1h2s, Team SG and all g4h members!!!

    Feedback is welcome, whatever it may be!

    References: Smashing the Stack for Fun and Profit (Aleph1); Hacking: The Art of Exploitation, 2nd Edition; http://www.securitytube.net
    Code:
    Food for thought:
    Try disassembling this and play with it!
    #include <stdio.h>

    int mul(int x, int y)
    {
        return x * y;
    }

    int main(void)
    {
        int a, b, c;

        scanf("%d %d", &a, &b);
        c = mul(a, b);
        return 0;
    }
    This is a bit more complicated than the previous program! But if you can understand the disassembled code, you will have a much clearer understanding of the stack. Good luck. Until next time.. \m/ Peace out

  2. #2
    Security Researcher fb1h2s's Avatar
    Join Date
    Jul 2010
    Location
    India
    Posts
    616
    Blog Entries
    32
    Hey, thanks for the share and good job. One suggestion: if you could add a diagram of the instructions on the stack, that would make it better and give readers a better understanding.
    Hacking Is a Matter of Time Knowledge and Patience

  3. #3
    Garage Member
    Join Date
    Sep 2010
    Location
    Chennai
    Posts
    83
    Blog Entries
    1
    Quote Originally Posted by fb1h2s View Post
    Hey, thanks for the share and good job. One suggestion: if you could add a diagram of the instructions on the stack, that would make it better and give readers a better understanding.
    Thanks man! Yeah, I will provide the diagrams with my next article.

  4. #4
    Thanks a lot. I agree with the others, diagrams would make it complete.

  5. #5
    Hello sebas, nice tutorial. I have a problem: when I try to access the memory address of ESI or EDI in a simple assembly program via GDB, GDB gives the error "value cant be converted to integer". Can you help with this?

  6. #6
    Garage Member
    Join Date
    Sep 2010
    Location
    Chennai
    Posts
    83
    Blog Entries
    1
    Quote Originally Posted by marc_kriss View Post
    Hello sebas, nice tutorial. I have a problem: when I try to access the memory address of ESI or EDI in a simple assembly program via GDB, GDB gives the error "value cant be converted to integer". Can you help with this?
    Can you please post your code?
