Since the x86 has evolved over three decades during which it had to respond to different »market forces« while maintaining backward compatibility, the x86 assembly language is burdened with an immense legacy.
A branching instruction changes the eip (instruction pointer) register if the given conditions are met.
The call instruction jumps unconditionally.
Before jumping, it pushes the eip (the address of the instruction that follows the call) onto the stack so that the callee can return to the caller.
push / pop
push first decrements(!) the stack pointer esp, then stores the pushed value at the memory location that esp points to.
pop first loads the indicated destination with the value that esp points to, then increments esp.
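The order of the two steps in push and pop can be modelled in a few lines of C. This is only a sketch: the Machine type, MEM_SIZE and the function names push32 and pop32 are made up for illustration, and the model ignores everything except a tiny memory and esp.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a small byte-addressed memory and the esp register. */
enum { MEM_SIZE = 64 };

typedef struct {
    uint8_t  mem[MEM_SIZE]; /* simulated memory, addresses 0 .. MEM_SIZE-1 */
    uint32_t esp;           /* simulated stack pointer (an address in mem) */
} Machine;

/* push: FIRST decrement esp by the operand size, THEN store at [esp]. */
void push32(Machine *m, uint32_t value) {
    assert(m->esp >= 4);            /* stay inside the model's memory */
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &value, 4);
}

/* pop: FIRST load the value from [esp], THEN increment esp. */
uint32_t pop32(Machine *m) {
    uint32_t value;
    assert(m->esp + 4 <= MEM_SIZE); /* there must be something to pop */
    memcpy(&value, &m->mem[m->esp], 4);
    m->esp += 4;
    return value;
}
```

Starting with esp = MEM_SIZE (an empty stack), two pushes lower esp by 8, and the values are popped in reverse order.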
Stack frame operations
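The stack pointer esp stores the address of the top of the stack. Because the stack grows towards lower addresses, this is the lowest address in use (hence »bottom«).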
The base pointer (ebp) stores the value of the stack pointer when the current function was entered.
Thus, the first instructions of a function are commonly:
pushl %ebp # Save base pointer of calling function
movl %esp, %ebp # Load base pointer with current stack pointer
subl $40, %esp # Make room for a few local variables.
After setting up the stack frame like this, various values can be accessed via the base pointer (whose value does not change in the context of the function):
ebp+ 0: the previous value of the ebp register.
ebp+ 4: the return address.
ebp+ 8: the first parameter.
ebp+12: the second parameter, etc.
ebp- 4: the first local variable.
ebp- 8: the second local variable, etc.
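The offsets listed above can be checked with a small C model of the stack. It is a sketch under assumptions: the Cpu type, the helper names and all values are hypothetical, the caller is assumed to push its parameters right to left (as in the cdecl convention), and only 8 bytes are reserved for locals instead of 40.

```c
#include <assert.h>
#include <stdint.h>

enum { WORDS = 32 };               /* model memory: 32 four-byte words */

typedef struct {
    uint32_t mem[WORDS];           /* word i lives at byte address 4*i */
    uint32_t esp, ebp;             /* byte addresses                   */
} Cpu;

void push(Cpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;                   /* the stack grows downwards */
    c->mem[c->esp / 4] = v;
}

/* Read the 32-bit word at ebp + offset (offset may be negative). */
uint32_t at_ebp(const Cpu *c, int32_t offset) {
    uint32_t addr = c->ebp + (uint32_t)offset;
    assert(addr % 4 == 0 && addr / 4 < WORDS);
    return c->mem[addr / 4];
}

/* Simulate calling f(p1, p2) and running the function prologue. */
void enter_function(Cpu *c, uint32_t p1, uint32_t p2,
                    uint32_t return_address) {
    push(c, p2);                    /* assumed: parameters pushed right to left */
    push(c, p1);
    push(c, return_address);        /* done by the call instruction */
    push(c, c->ebp);                /* pushl %ebp                   */
    c->ebp = c->esp;                /* movl %esp, %ebp              */
    c->esp -= 8;                    /* subl $8, %esp: two locals    */
    c->mem[(c->ebp - 4) / 4] = 111; /* first local variable         */
    c->mem[(c->ebp - 8) / 4] = 222; /* second local variable        */
}
```

After enter_function, at_ebp(c, 0) yields the caller's ebp, at_ebp(c, 4) the return address, at_ebp(c, 8) and at_ebp(c, 12) the parameters, and at_ebp(c, -4) and at_ebp(c, -8) the locals.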
A function is then left with
leave ; FIRST set esp = ebp, THEN pop ebp.
ret ; esp now points to the return address: pop it into eip.
leave is equivalent to (in Intel syntax, where the destination operand comes first):
mov esp, ebp
pop ebp
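The effect of leave followed by ret on the three registers involved can be sketched in C. The Cpu type and the memory layout are again hypothetical; the model only tracks esp, ebp and eip.

```c
#include <assert.h>
#include <stdint.h>

enum { WORDS = 32 };               /* model memory: 32 four-byte words */

typedef struct {
    uint32_t mem[WORDS];           /* word i lives at byte address 4*i */
    uint32_t esp, ebp, eip;
} Cpu;

uint32_t pop(Cpu *c) {
    assert(c->esp % 4 == 0 && c->esp / 4 < WORDS);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* leave: discard the locals (esp = ebp), then restore the caller's ebp. */
void leave(Cpu *c) {
    c->esp = c->ebp;
    c->ebp = pop(c);
}

/* ret: pop the return address into eip. */
void ret(Cpu *c) {
    c->eip = pop(c);
}
```

If ebp points at the saved base pointer and the return address sits 4 bytes above it, leave and ret restore the caller's ebp, continue at the return address, and leave esp just above the return address, regardless of how many bytes were reserved for locals.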
Sections
Sections group portions of code and data that have a similar purpose or should have the same memory permissions.
Common names:
.text: code. In kernel drivers, this section is never paged out (in contrast to the PAGE* sections).
.data: read/write (global variables)
.rdata: read only data (e.g. strings)
.bss: block storage start (or block started by symbol). Uninitialized data: only the size of the objects is specified. The .bss section seems to be merged into the .data section by the linker. Since it contains uninitialized data, it takes up no space in the object file, which reduces the file's size; it is »expanded« in memory when the executable is loaded.
.reloc: relocation information, used to modify addresses.
.idata: import address table. (Seems to be merged into .text or .rdata.)
.edata: export information
.rsrc: resources
PAGE*: pageable code. Apparently mainly used for kernel drivers.
Unneeded sections can be disposed of with strip.
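To relate the section names to source code, the following C fragment shows where a compiler would typically place a few definitions. The placement in the comments is an assumption about common toolchains, not something the C language guarantees, and the identifiers are made up for illustration.

```c
#include <assert.h>

int counter = 42;                /* initialized, writable: typically .data */

int buffer[4096];                /* no initializer, zero-filled at load time:
                                    typically .bss (only its size is stored) */

const char greeting[] = "hello"; /* read-only data: typically .rdata
                                    (.rodata on ELF toolchains)              */

/* Code: typically .text. */
int sum_buffer(void) {
    int sum = 0;
    for (int i = 0; i < 4096; i++) {
        sum += buffer[i];
    }
    return sum;
}
```

Because buffer has no initializer, its 16 KiB (assuming a 4-byte int) do not need to be stored in the object file, whereas the value 42 of counter does.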
Data for operands
The data for an operand can be stored
as an immediate (a constant value that is encoded in the instruction itself)
in a register
in a memory location
The address of a memory location can be calculated by base + index*scale + displacement.
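The address formula can be written as a small C function. This is a sketch for 32-bit addressing: the function name is made up, and the check reflects that x86 only allows the scale factors 1, 2, 4 and 8.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = base + index*scale + displacement,
   wrapping around at 32 bits. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t displacement) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)displacement;
}
```

For example, with base = 0x1000, index = 3, scale = 4 and displacement = 8 (in AT&T syntax: 8(%ebx,%esi,4) with ebx = 0x1000 and esi = 3), the address is 0x1000 + 12 + 8 = 0x1014. This is the pattern a compiler can use to reach element 3 of an array of 4-byte integers that starts 8 bytes into a structure.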
HLA: High Level Assembler, uses a high-level-language-like syntax for declarations of functions and procedure calls and allows for control structures (if, while …).
As a high-level assembler, it requires a (real) low-level assembler such as as or masm.
First instruction of an x86
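An x86 begins execution with the instruction stored at the reset vector, 16 bytes below the top of the physical address space: 0xffff0 (ffff:0000) on the 8086, 0xfffffff0 on the 80386 and later.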
Apparently, there is a jump to the BIOS start routine: