Usage

The call stack is implemented by the CPU and is used to save the state of the calling procedure, pass parameters to the called procedure, and store local variables for the currently executing procedure

  • Return address is saved on the stack to support nested functions (e.g., recursion)
  • On x86-64 (System V ABI), up to 6 integer or pointer arguments are passed in registers; the rest are passed on the stack
  • Local variables are saved on the stack if no registers are left or if the address of the variable needs to be taken
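A minimal sketch of the last two points, assuming the System V AMD64 calling convention and a GCC or Clang compiler. The names `sum8`, `add_one` and `local_with_address` are hypothetical, and the register assignments in the comments describe what the ABI prescribes, not something the C code can observe:

```c
/* Hypothetical 8-argument function, for illustration only.
 * Under the System V AMD64 ABI the first six integer arguments
 * (a..f) arrive in RDI, RSI, RDX, RCX, R8, R9; g and h do not fit
 * in the argument registers and are passed on the stack. */
long sum8(long a, long b, long c, long d,
          long e, long f, long g, long h) {
    return a + b + c + d + e + f + g + h;
}

/* noinline (a GCC/Clang attribute) keeps the call real, so the
 * pointer below has to refer to an actual memory location. */
__attribute__((noinline)) static void add_one(long *value) {
    *value += 1;
}

long local_with_address(long x) {
    long local = x;   /* could live in a register on its own ...      */
    add_one(&local);  /* ... but taking its address forces it into
                         memory, normally a slot in this stack frame */
    return local;
}
```

Compiling with `gcc -O1 -S` and reading the generated assembly is one way to check where the compiler actually placed each value.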

The stack grows toward lower addresses, and the heap grows towards higher addresses, allowing the use of as much space as possible before a collision
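The growth direction of the stack can be observed from C by comparing the address of a local variable in an outer call with one in a deeper call. This is only a sketch: the function names are hypothetical, `noinline` is a GCC/Clang attribute, and the result assumes a conventional downward-growing stack such as on x86-64:

```c
#include <stdint.h>

/* Recurse `depth` levels and return the address of a local variable
 * in the deepest frame. */
__attribute__((noinline))
static uintptr_t deepest_local_address(int depth) {
    volatile char marker = 0;
    if (depth == 0) {
        return (uintptr_t)&marker;
    }
    uintptr_t deeper = deepest_local_address(depth - 1);
    marker = (char)depth;  /* work after the call, so it is not a tail call */
    return deeper;
}

/* Returns 1 if a local in a deeper call has a lower address than a
 * local in the current call, i.e. the stack grows toward lower
 * addresses on this platform. */
int stack_grows_down(void) {
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    uintptr_t deeper = deepest_local_address(8);
    return deeper < here;
}
```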

The stack decouples the subroutine implementation from the caller. A subroutine may overwrite registers, so without the stack the caller would have to know the subroutine's inner workings and prepare for them

stack frame pointer.png|300

Kernel and User Stack

Kernel space and user space each have their own distinct call stacks. The OS switches between them when transferring control between kernel and user space, e.g. via interrupts. This stack switching is done to prevent privileged procedures from crashing due to insufficient stack space. It also prevents less privileged procedures from interfering with more privileged procedures by sharing the same stack

Stack Registers

The stack pointer (contained in the RSP register) stores the address of the most recently pushed item - the top of the stack (the lowest address in use)

The stack-frame base pointer (contained in the RBP register) identifies a fixed reference point within the stack frame of the called procedure. On x86-64, optimizing compilers typically use RBP as a frame pointer only when the stack frame can be of variable size. On i386, most compilers always used EBP, but recent versions of GCC have dropped this convention. When RBP is used as a frame pointer, the caller’s RBP value is saved on the stack first, because RBP is a callee-saved register
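GCC and Clang expose the frame base to C code through the `__builtin_frame_address` builtin, which makes the per-call reference point visible. The sketch below uses hypothetical function names and assumes one of those compilers. It compares the frame base of a caller with that of its callee:

```c
#include <stdint.h>

/* Frame base of this function's own stack frame. On x86-64 this is
 * the value RBP holds while the function runs. */
__attribute__((noinline))
static uintptr_t callee_frame_base(void) {
    return (uintptr_t)__builtin_frame_address(0);
}

/* Returns 1 if the callee's frame base lies below the caller's,
 * which is expected when the stack grows toward lower addresses. */
__attribute__((noinline))
int callee_frame_is_below_caller(void) {
    uintptr_t caller = (uintptr_t)__builtin_frame_address(0);
    uintptr_t callee = callee_frame_base();
    return callee < caller;
}
```

Using the builtin forces the compiler to set up a frame pointer in that function, even if frame pointers are otherwise omitted.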

stack no frame pointer.png|300

Stack Manipulation Instructions

Items are placed on the stack using the PUSH instruction and removed from the stack using the POP instruction:

  • PUSH decrements the stack pointer (contained in the RSP register), and copies the source operand to the top of stack
  • POP copies the value at the current top of stack (indicated by the RSP register) to the location specified with the destination operand. It then increments the RSP register to point to the new top of stack
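The two instructions can be modeled in a few lines of C. This is a toy model, not how the hardware is implemented: `ToyStack` and its functions are hypothetical names, `rsp` is an offset into a small byte array instead of a real address, and there are no overflow checks:

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 256

typedef struct {
    uint8_t  memory[STACK_BYTES];
    uint64_t rsp;  /* offset of the top of stack within memory */
} ToyStack;

/* An empty stack has rsp just past the highest address. */
void toy_init(ToyStack *s) {
    memset(s->memory, 0, sizeof s->memory);
    s->rsp = STACK_BYTES;
}

/* PUSH: decrement rsp first, then store the operand at the new top. */
void toy_push(ToyStack *s, uint64_t value) {
    s->rsp -= sizeof value;
    memcpy(&s->memory[s->rsp], &value, sizeof value);
}

/* POP: load from the current top, then increment rsp. The old bytes
 * stay in memory; only rsp moves. */
uint64_t toy_pop(ToyStack *s) {
    uint64_t value;
    memcpy(&value, &s->memory[s->rsp], sizeof value);
    s->rsp += sizeof value;
    return value;
}
```

Pushing 1 and then 2 moves `rsp` down by 16 bytes; two pops return 2 and then 1 and leave `rsp` back at its initial value.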

Calling and Return from Procedure

The CALL instruction is used to transfer control to the procedure:

  1. Pushes the current value of the RIP register on the stack (the address of the instruction following CALL, i.e. the return address)
  2. Loads the address of the called procedure into the RIP register
  3. Begins execution of the called procedure

The RET instruction is used to return from the called procedure:

  1. Pops the top-of-stack value (the return instruction pointer) into the RIP register
  2. Resumes execution of the calling procedure
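The CALL and RET steps above can be sketched as a toy model in C. `ToyCpu` and its functions are hypothetical, the stack is an array of 64-bit slots instead of real memory, and the instruction length is passed in explicitly because the model does not decode instructions:

```c
#include <stdint.h>

#define TOY_SLOTS 32

typedef struct {
    uint64_t stack[TOY_SLOTS];
    int      sp;   /* index of the top slot; TOY_SLOTS means empty */
    uint64_t rip;  /* address of the instruction being executed    */
} ToyCpu;

void cpu_init(ToyCpu *cpu, uint64_t entry) {
    cpu->sp = TOY_SLOTS;
    cpu->rip = entry;
}

/* CALL: push the address of the next instruction, then jump. */
void cpu_call(ToyCpu *cpu, uint64_t target, uint64_t insn_len) {
    uint64_t return_address = cpu->rip + insn_len;
    cpu->stack[--cpu->sp] = return_address;
    cpu->rip = target;
}

/* RET: pop the return address back into rip. */
void cpu_ret(ToyCpu *cpu) {
    cpu->rip = cpu->stack[cpu->sp++];
}
```

Because each CALL pushes its own return address, nested calls unwind in reverse order, which is what makes recursion work.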

By convention:

  • Once a function completes, the stack returns to the state it was in prior to the function call
  • The return value from a subroutine is stored in a register
  • Parameters that are passed on the stack are pushed in reverse order (last parameter first). This gives a sensible layout - later parameters have bigger offsets from RBP. It also keeps fixed arguments (e.g., the number of arguments passed in) at known offsets when using variadic functions
  • Old data left on the stack by a previous call is usually not cleared
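The variadic case can be illustrated with the standard `<stdarg.h>` macros. `sum_ints` is a hypothetical function whose fixed first argument tells it how many values follow. Note that on x86-64 the first arguments still travel in registers, and `va_arg` hides that detail:

```c
#include <stdarg.h>

/* Sums `count` int arguments. The callee can only find the variable
 * arguments because the fixed argument `count` is at a known place. */
long sum_ints(int count, ...) {
    va_list args;
    long total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

For example, `sum_ints(3, 1, 2, 3)` returns 6. Passing a `count` larger than the number of actual arguments is undefined behavior, since the callee cannot verify it.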

Stack Memory Management

TODO Elaborate

By default, Linux user space stacks use 8 MiB of virtual address space, divided into 4 KiB pages. Physical pages are allocated lazily, so in reality only a subset of the 8 MiB address space is backed by RAM
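The actual values on a given system can be queried with the POSIX calls `getrlimit` and `sysconf`. The wrapper names below are hypothetical. The soft limit may be `RLIM_INFINITY` if the stack size is unlimited, and the page size is not 4 KiB on every platform:

```c
#include <sys/resource.h>
#include <unistd.h>

/* Soft limit on the main thread's stack size in bytes, or 0 if the
 * query fails. This is the value `ulimit -s` reports (in KiB). */
unsigned long long stack_soft_limit(void) {
    struct rlimit limit;
    if (getrlimit(RLIMIT_STACK, &limit) != 0) {
        return 0;
    }
    return (unsigned long long)limit.rlim_cur;
}

/* Size of one virtual memory page in bytes, or -1 on error. */
long page_size(void) {
    return sysconf(_SC_PAGESIZE);
}
```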

There is a protected guard page at the end of the stack address space. Accessing it causes a trap and process termination, so a stack overflow is caught instead of silently corrupting adjacent memory

  1. The OS reserves 8 MiB of virtual memory for a stack by setting up the MMU’s page tables for a thread. This requires very little RAM, only enough to hold the page table entries
  2. When a thread runs and tries to access a virtual address on the stack that doesn’t have a physical page assigned to it yet, a page fault exception is triggered by the MMU
  3. The CPU core responds to the page fault exception by switching to a privileged execution mode (which has its own stack) and calling the page fault exception handler function inside the kernel
  4. The kernel allocates a page of physical RAM, maps it to that virtual memory page, and returns to the user space thread, which retries the access
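The steps above can be observed indirectly by counting minor page faults around the first write to a fresh region of the stack. This is a rough sketch with hypothetical function names, assuming a POSIX system where `getrusage` reports `ru_minflt` and a GCC or Clang compiler. The count depends on how much of the stack was already touched, so no particular number is guaranteed:

```c
#include <sys/resource.h>
#include <unistd.h>

/* Minor page faults of this process so far, or -1 on error. */
static long minor_faults(void) {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_minflt;
}

/* Writes one byte per page into a 256 KiB stack buffer and returns
 * how many minor page faults happened meanwhile (-1 on error). */
__attribute__((noinline))
long faults_from_touching_stack(void) {
    enum { BYTES = 256 * 1024 };
    volatile char buffer[BYTES];
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        page = 4096;  /* fallback assumption if the query fails */
    }

    long before = minor_faults();
    for (long offset = 0; offset < BYTES; offset += page) {
        buffer[offset] = 1;  /* first touch may trigger a page fault */
    }
    long after = minor_faults();

    if (before < 0 || after < 0) {
        return -1;
    }
    return after - before;
}
```

On a typical Linux system the first call reports a number of faults close to the number of pages touched, while a second call reports few or none because the pages are already mapped.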

References