CPU Privilege Levels
The CPU uses privilege levels to prevent a program operating at a lesser privilege level from accessing a segment with a greater privilege level, except under controlled situations. When the CPU detects a privilege level violation, it generates a General Protection Fault
The x86-64 has four privilege levels (protection rings), numbered from 0 to 3, each having its own stack. Most OSs, however, use only two levels: level 0 for kernel mode and level 3 for user mode, with a kernel stack and a user stack respectively
The two low-order bits of the code segment register CS indicate the current privilege level (CPL) of the program
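As a minimal sketch of that encoding, the helpers below extract the CPL from a CS selector value by masking its two low-order bits. The function names are hypothetical and chosen for illustration only; the code works on a plain integer and does not read the real CS register:

```c
#include <assert.h>
#include <stdint.h>

/* The privilege level lives in the two low-order bits of the selector. */
#define SELECTOR_PL_MASK 0x3u

/* Returns the privilege level (0..3) encoded in a CS selector value. */
unsigned cpl_from_cs(uint16_t cs)
{
    unsigned cpl = cs & SELECTOR_PL_MASK;
    assert(cpl <= 3);
    return cpl;
}

/* Non-zero when the selector belongs to code running in user mode (level 3). */
int cs_is_user_mode(uint16_t cs)
{
    return cpl_from_cs(cs) == 3;
}
```

For example, with the selector values that Linux uses on x86-64, cpl_from_cs(0x33) yields 3 (user code) and cpl_from_cs(0x10) yields 0 (kernel code)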
Privileged Registers
These registers, or the system fields within them, can be modified only in privileged mode (privilege level 0). Most notable are:
- RFLAGS register. Contains system flags, most notable are:
  - Interrupt enable flag IF - Controls the response of the CPU to maskable hardware interrupts
  - I/O privilege level field IOPL - Indicates the I/O privilege level of the currently running program
- Interrupt Descriptor Table Register IDTR - Holds the base address and limit of the Interrupt Descriptor Table (IDT)
- Control registers:
  - CR0 - Controls the operating mode of the CPU
  - CR2 - Contains the linear address that caused the most recent page fault
  - CR3 - Contains the physical address of the root of the page table hierarchy, which the CPU walks to locate a memory page
- Model-specific registers (MSRs). Control the debug extensions, the performance-monitoring counters, the machine-check architecture, and the memory type ranges
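To make the two RFLAGS fields concrete, here is a small sketch that decodes them from a saved RFLAGS value. The bit positions (IF in bit 9, IOPL in bits 12 and 13) follow the Intel manual; the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define RFLAGS_IF_BIT     9u   /* interrupt enable flag */
#define RFLAGS_IOPL_SHIFT 12u  /* IOPL occupies bits 12 and 13 */
#define RFLAGS_IOPL_MASK  0x3u

/* Non-zero when maskable hardware interrupts are enabled. */
int rflags_interrupts_enabled(uint64_t rflags)
{
    return (int)((rflags >> RFLAGS_IF_BIT) & 1u);
}

/* Returns the I/O privilege level (0..3) stored in RFLAGS. */
unsigned rflags_iopl(uint64_t rflags)
{
    return (unsigned)((rflags >> RFLAGS_IOPL_SHIFT) & RFLAGS_IOPL_MASK);
}

/* IOPL-sensitive instructions are permitted only when CPL <= IOPL. */
int io_allowed(unsigned cpl, uint64_t rflags)
{
    assert(cpl <= 3);
    return cpl <= rflags_iopl(rflags);
}
```

A value such as 0x202 decodes to IF = 1 and IOPL = 0, so a program at CPL 3 would not pass the io_allowed check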
Privileged Instructions
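Privileged instructions are instructions that can only be executed at a CPL of 0 (privilege level 0)
Most notable instructions:
- LIDT - Load the IDT register
- CLI/STI - Clear and set the IF flag (strictly, these two are IOPL-sensitive: they execute only when CPL is less than or equal to IOPL)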
Interrupts and Exceptions
There are generally three classes of interrupts on most platforms:
- A hardware interrupt is an asynchronous event triggered by an interrupt request (IRQ) sent from an external hardware device, e.g. an I/O device or network adapter
- A software interrupt is a synchronous event triggered by an application executing the INT n instruction
- An exception is a synchronous event that is generated when the CPU detects predefined conditions while executing an instruction. Exceptions are further classified as faults, traps, and aborts
When an interrupt is received or an exception is detected, the currently running procedure is suspended while the CPU executes an interrupt or exception handler, also known as an interrupt service routine (ISR). The CPU accesses the handler through an entry in the interrupt descriptor table (IDT). When execution of the handler is complete, depending on the type of the handler:
- Interrupt handler returns to the next instruction of the interrupted program, i.e. the one that would have executed had the interrupt not occurred
- Trap handler returns to the instruction after the trapping instruction
- Fault handler returns to the faulting instruction, thereby re-executing it. If the handler is not able to correct the fault condition, it returns to an abort routine in the kernel that terminates the application
- Abort handler returns to an abort routine that terminates the application
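The return rules above can be summarized in a short sketch. This is a simplified model rather than anything the CPU exposes: the enum, the NO_RESUME sentinel and the function name are all hypothetical, and insn_addr stands for the trapping or faulting instruction (or, for an interrupt, the last instruction completed before it was taken):

```c
#include <assert.h>
#include <stdint.h>

enum event_class { EVENT_INTERRUPT, EVENT_TRAP, EVENT_FAULT, EVENT_ABORT };

/* Sentinel meaning "the program is terminated instead of resumed". */
#define NO_RESUME UINT64_MAX

/* Returns the address at which the interrupted program resumes. */
uint64_t resume_address(enum event_class cls, uint64_t insn_addr,
                        unsigned insn_len, int fault_corrected)
{
    /* x86 instructions are between 1 and 15 bytes long. */
    assert(insn_len >= 1 && insn_len <= 15);

    switch (cls) {
    case EVENT_INTERRUPT:
    case EVENT_TRAP:
        return insn_addr + insn_len;       /* continue with the next instruction */
    case EVENT_FAULT:
        /* Re-execute the faulting instruction only if the fault was fixed. */
        return fault_corrected ? insn_addr : NO_RESUME;
    case EVENT_ABORT:
    default:
        return NO_RESUME;
    }
}
```

A page fault that the kernel resolves by mapping the missing page is the typical corrected-fault case: the same instruction runs again and this time succeeds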
Interrupt Enable Flag and Non-Maskable Interrupts
There are situations when a handler should not be interrupted, usually while it is inside a critical section
The interrupt enable flag IF in the RFLAGS register controls whether maskable hardware interrupts are served by the CPU. This flag can be set using the STI instruction and cleared using the CLI instruction
Software interrupts generated with the INT n instruction cannot be masked by the IF flag
The non-maskable interrupt (NMI) is a special interrupt that cannot be masked, and is often a result of a critical hardware failure. When the CPU receives an NMI, it handles it immediately by calling the NMI handler pointed to by vector 2 in the IDT. While an NMI handler is executing, the CPU blocks delivery of subsequent NMIs until the next execution of the IRET instruction. Maskable hardware interrupts are also held off during that time if the handler was entered through an interrupt gate, which clears the IF flag
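The masking rules in this section can be expressed as a small decision function. This is an illustrative model only; the structure, enum and function names are hypothetical and do not correspond to any real kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum irq_source { SRC_MASKABLE_HW, SRC_SOFTWARE_INT, SRC_NMI };

struct cpu_irq_state {
    bool if_flag;          /* models RFLAGS.IF */
    bool nmi_in_progress;  /* set on NMI entry, cleared by the next IRET */
};

/* Models CLI and STI: clear or set the interrupt enable flag. */
void model_cli(struct cpu_irq_state *s) { assert(s != NULL); s->if_flag = false; }
void model_sti(struct cpu_irq_state *s) { assert(s != NULL); s->if_flag = true; }

/* Decides whether an event from the given source is delivered right now. */
bool can_deliver(const struct cpu_irq_state *s, enum irq_source src)
{
    assert(s != NULL);
    switch (src) {
    case SRC_MASKABLE_HW:
        return s->if_flag;            /* masked while IF is clear */
    case SRC_SOFTWARE_INT:
        return true;                  /* INT n ignores IF */
    case SRC_NMI:
        return !s->nmi_in_progress;   /* blocked only by an NMI in progress */
    default:
        return false;
    }
}
```

After model_cli, can_deliver returns false for SRC_MASKABLE_HW but still returns true for SRC_SOFTWARE_INT and SRC_NMI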
Interrupt Descriptor Table (IDT)
On x86-64 each interrupt and exception is assigned a number, called a vector. The x86-64 CPU supports 256 vectors from 0 to 255:
- Vectors 0-31 are x86-64 defined exceptions and the NMI interrupt. For example:
  - Vector 0: Divide Error
  - Vector 1: Debug Exception
  - Vector 2: Non-maskable external interrupt (NMI)
  - Vector 3: Breakpoint, generated by the INT3 instruction
  - Vector 4: Overflow, generated by the INTO instruction
  - Vector 13: General Protection Fault, e.g. segmentation fault
  - Vector 14: Page Fault
- Vectors 32-255 are defined by the OS kernel
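The vector ranges listed above translate directly into a lookup helper. The sketch below covers only the vectors named in this section; the function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

#define VECTOR_COUNT    256u
#define FIRST_OS_VECTOR 32u

/* True for vectors 0..31, which the architecture reserves for itself. */
bool vector_is_architectural(unsigned vector)
{
    assert(vector < VECTOR_COUNT);
    return vector < FIRST_OS_VECTOR;
}

/* Returns a human-readable name for the vectors listed in the text. */
const char *vector_name(unsigned vector)
{
    assert(vector < VECTOR_COUNT);
    switch (vector) {
    case 0:  return "Divide Error";
    case 1:  return "Debug Exception";
    case 2:  return "NMI";
    case 3:  return "Breakpoint";
    case 4:  return "Overflow";
    case 13: return "General Protection Fault";
    case 14: return "Page Fault";
    default:
        return vector_is_architectural(vector)
                   ? "other architecture-defined vector"
                   : "OS-defined vector";
    }
}
```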
The x86-64 uses the interrupt descriptor table (IDT) which associates each vector with a gate descriptor for the interrupt/exception handler. The IDT is an array of 256 descriptors, one for each vector
The IDT resides in RAM and its address is stored in the IDTR register. At system boot time the OS sets up the IDT using the LIDT and SIDT instructions:
- LIDT - Loads the IDT base address and limit from memory into the IDTR register
- SIDT - Stores the IDT base address and limit from the IDTR register into memory
At run time, when an interrupt or exception occurs, the CPU retrieves the corresponding gate descriptor for the vector, and uses it to invoke the handler
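As a rough illustration of what LIDT and SIDT transfer, the sketch below models the 10-byte memory operand (a 2-byte limit followed by an 8-byte base) and the address computation the CPU performs to find a gate descriptor. The 16-byte entry size is the 64-bit mode gate descriptor size; the struct and function names are hypothetical, and the packed attribute is a GCC/Clang extension:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDT_ENTRIES    256u
#define IDT_ENTRY_SIZE 16u

/* Memory operand of LIDT/SIDT in 64-bit mode. */
struct idtr_value {
    uint16_t limit;  /* size of the table in bytes, minus one */
    uint64_t base;   /* linear address of the first descriptor */
} __attribute__((packed));

/* Builds the value an OS would hand to LIDT for a full 256-entry table. */
struct idtr_value make_idtr(uint64_t idt_base)
{
    struct idtr_value v;
    v.limit = (uint16_t)(IDT_ENTRIES * IDT_ENTRY_SIZE - 1u);
    v.base = idt_base;
    return v;
}

/* Address of the gate descriptor for a vector, or 0 if it lies past the limit. */
uint64_t gate_descriptor_address(const struct idtr_value *idtr, unsigned vector)
{
    uint64_t offset = (uint64_t)vector * IDT_ENTRY_SIZE;
    assert(idtr != NULL);
    if (offset + IDT_ENTRY_SIZE - 1u > idtr->limit)
        return 0;
    return idtr->base + offset;
}
```

In a real kernel the structure would be passed to LIDT through inline assembly at CPL 0; that step is omitted here because it cannot run in user mode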
IDT Gate Descriptors
Code at a lower privilege level can only access code operating at a higher privilege level by means of a tightly controlled and protected interface called a gate. Attempts to access higher privilege levels without going through a gate and without having sufficient access rights cause a General Protection Fault
The IDT may contain either an interrupt-gate descriptor or a trap-gate descriptor. If an interrupt or exception handler is called through an interrupt gate, the CPU clears the IF flag to prevent subsequent interrupts from interfering with the execution of the handler. When a handler is called through a trap gate, the state of the IF flag is not changed
A gate descriptor is 16 bytes long and contains both a handler address and a descriptor privilege level (DPL) field, used to control access rights. The DPL field defines the CPU privilege levels which are allowed to access this gate via the INT n instruction. This restriction prevents programs running at user level from using a software interrupt to access critical exception handlers, such as the Page Fault handler, provided that those handlers are placed in kernel space. For hardware-generated interrupts and processor-detected exceptions, the CPU ignores the DPL of interrupt and trap gates
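The following sketch lays out a 64-bit gate descriptor as a C structure and shows how the handler address and DPL are packed into it. The field layout follows the Intel manual (type 0xE for an interrupt gate, 0xF for a trap gate); the struct, macro and function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GATE_TYPE_INTERRUPT 0xEu
#define GATE_TYPE_TRAP      0xFu
#define GATE_PRESENT        0x80u

struct gate_descriptor {
    uint16_t offset_low;   /* handler address bits 0..15 */
    uint16_t selector;     /* code segment selector of the handler */
    uint8_t  ist;          /* bits 0..2: interrupt stack table index */
    uint8_t  type_attr;    /* bits 0..3 type, bits 5..6 DPL, bit 7 present */
    uint16_t offset_mid;   /* handler address bits 16..31 */
    uint32_t offset_high;  /* handler address bits 32..63 */
    uint32_t reserved;
};

_Static_assert(sizeof(struct gate_descriptor) == 16, "gate must be 16 bytes");

struct gate_descriptor make_gate(uint64_t handler, uint16_t selector,
                                 unsigned type, unsigned dpl)
{
    struct gate_descriptor g = {0};
    assert(type == GATE_TYPE_INTERRUPT || type == GATE_TYPE_TRAP);
    assert(dpl <= 3);
    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.offset_mid  = (uint16_t)((handler >> 16) & 0xFFFFu);
    g.offset_high = (uint32_t)(handler >> 32);
    g.selector    = selector;
    g.type_attr   = (uint8_t)(GATE_PRESENT | (dpl << 5) | type);
    return g;
}

/* Reassembles the 64-bit handler address from its three pieces. */
uint64_t gate_handler(const struct gate_descriptor *g)
{
    return ((uint64_t)g->offset_high << 32) |
           ((uint64_t)g->offset_mid << 16) | g->offset_low;
}

unsigned gate_dpl(const struct gate_descriptor *g)
{
    return (g->type_attr >> 5) & 0x3u;
}

/* INT n from privilege level cpl is allowed only when cpl <= DPL. */
bool software_int_allowed(const struct gate_descriptor *g, unsigned cpl)
{
    return cpl <= gate_dpl(g);
}
```

With a DPL of 0, software_int_allowed returns false for CPL 3, which corresponds to the General Protection Fault a user program gets when it tries INT 14. A gate meant to be reachable from user mode, such as the breakpoint vector, would be created with a DPL of 3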
Calling and Returning from Exceptions and Interrupts
A call to an interrupt/exception handler is similar to a procedure call. The CPU performs the following steps:
- Temporarily saves (internally) the current contents of the RSP, RFLAGS and RIP registers
- Loads the stack pointer for the kernel stack into the RSP register and switches to the kernel stack
- Pushes the temporarily saved RSP, RFLAGS, RIP values onto the kernel stack
- Loads the new instruction pointer from the gate into the RIP register
- If the call is through an interrupt gate, clears the IF flag in the RFLAGS register
- Begins execution of the handler procedure in the kernel mode
The IRET instruction is used to return from the handler. The CPU performs the following steps:
- Performs a privilege check
- Restores the RIP and RFLAGS registers from the kernel stack
- Restores the RSP register, resulting in a switch back to the user stack
- Resumes execution of the interrupted procedure in the user mode
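The entry and return sequences described in this section can be modeled in plain C. The sketch below is a deliberate simplification that tracks only the three registers named in the text; real hardware additionally pushes the SS and CS selectors and, for some exceptions, an error code. All type and function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RFLAGS_IF   (1ull << 9)
#define STACK_SLOTS 8u

struct cpu {
    uint64_t rsp, rflags, rip;
    bool kernel_mode;
};

/* Model of the kernel stack; it grows toward slot 0, like a real stack. */
struct kernel_stack {
    uint64_t base;                /* address that slots[0] stands for */
    uint64_t slots[STACK_SLOTS];
    unsigned top;                 /* index of the last push; STACK_SLOTS if empty */
};

static void push(struct kernel_stack *ks, uint64_t v)
{
    assert(ks->top > 0);
    ks->slots[--ks->top] = v;
}

static uint64_t pop(struct kernel_stack *ks)
{
    assert(ks->top < STACK_SLOTS);
    return ks->slots[ks->top++];
}

/* Models delivery of an interrupt or exception through a gate. */
void deliver(struct cpu *c, struct kernel_stack *ks,
             uint64_t handler, bool interrupt_gate)
{
    /* Save the interrupted context internally. */
    uint64_t rsp = c->rsp, rflags = c->rflags, rip = c->rip;

    /* Push the saved values onto the kernel stack. */
    push(ks, rsp);
    push(ks, rflags);
    push(ks, rip);

    /* RSP now points at the kernel stack, RIP at the handler. */
    c->rsp = ks->base + 8u * ks->top;
    c->rip = handler;
    if (interrupt_gate)
        c->rflags &= ~RFLAGS_IF;  /* interrupt gates clear IF */
    c->kernel_mode = true;
}

/* Models IRET: restore RIP, RFLAGS and RSP in the reverse order. */
void return_from_handler(struct cpu *c, struct kernel_stack *ks)
{
    c->rip = pop(ks);
    c->rflags = pop(ks);
    c->rsp = pop(ks);
    c->kernel_mode = false;
}
```

Because RFLAGS is restored from the stack, the IF flag cleared by an interrupt gate is automatically set again when the handler returns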
Advanced Programmable Interrupt Controller (APIC)
Local APIC (LAPIC)
The Local Advanced Programmable Interrupt Controller (LAPIC) performs two primary functions:
- It receives interrupts from the CPU interrupt pins, internal sources and an external I/O APIC and sends these to the CPU core for handling
- It sends and receives interprocessor interrupt (IPI) messages between logical CPUs on the system bus
It receives interrupts from the following sources:
- Locally connected I/O devices: Devices connected directly to the CPU’s local interrupt pins LINT0 and LINT1
- APIC timer
- Externally connected I/O devices: Devices connected to the interrupt input pins of an I/O APIC
- IPIs: Allows one CPU to interrupt another CPU or group of CPUs on the system bus
Interrupts from LINT0, LINT1 and the APIC timer are delivered based on configurations set up through a group of memory-mapped registers called the Local Vector Table (LVT). Each local interrupt source has a dedicated register (table entry) that specifies its interrupt vector number and other setup information
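As an illustration of what an LVT entry holds, the sketch below encodes and decodes the two most basic fields: the vector in bits 0-7 and the mask bit in bit 16. The register offsets and the default base address 0xFEE00000 are the xAPIC values from the Intel manual; the function names are hypothetical and other fields (delivery mode, timer mode and so on) are left at zero:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LAPIC_DEFAULT_BASE 0xFEE00000u
#define LVT_TIMER_OFFSET   0x320u
#define LVT_LINT0_OFFSET   0x350u
#define LVT_LINT1_OFFSET   0x360u
#define LVT_MASKED         (1u << 16)

/* Builds an LVT entry with fixed delivery mode and the given vector. */
uint32_t lvt_entry(uint8_t vector, bool masked)
{
    assert(vector >= 16);  /* the APIC treats vectors 0..15 as illegal */
    return (uint32_t)vector | (masked ? LVT_MASKED : 0u);
}

uint8_t lvt_vector(uint32_t entry)
{
    return (uint8_t)(entry & 0xFFu);
}

bool lvt_is_masked(uint32_t entry)
{
    return (entry & LVT_MASKED) != 0;
}

/* Physical address of an LVT register inside the memory-mapped LAPIC page. */
uint32_t lvt_register_address(uint32_t lapic_base, uint32_t offset)
{
    return lapic_base + offset;
}
```

A kernel would write the result of lvt_entry to the address returned by lvt_register_address through a mapped, uncached pointer; that access is omitted here because it requires kernel mode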
Interrupts from externally connected I/O devices and IPIs are handled by IPI messaging. In multi-processor systems, LINT0 and LINT1 are typically unused, as all interrupts are routed through the I/O APIC using IPI
The Interrupt Command Register (ICR) is used by the LAPIC to send IPIs, specifying the target CPUs and interrupt vector number
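A minimal sketch of how an ICR value for a fixed-delivery IPI could be assembled is shown below. It assumes the xAPIC layout from the Intel manual (vector in bits 0-7, level-assert in bit 14, destination shorthand in bits 18-19, destination APIC ID in bits 56-63); the names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define ICR_LOW_OFFSET   0x300u        /* writing this half sends the IPI */
#define ICR_HIGH_OFFSET  0x310u
#define ICR_LEVEL_ASSERT (1ull << 14)
#define ICR_DEST_SHIFT   56u
#define ICR_SHORTHAND_SHIFT 18u

enum icr_shorthand {
    ICR_NO_SHORTHAND       = 0,  /* use the destination field */
    ICR_SELF               = 1,
    ICR_ALL_INCLUDING_SELF = 2,
    ICR_ALL_EXCLUDING_SELF = 3
};

/* Builds the 64-bit ICR value for a fixed-delivery IPI. */
uint64_t icr_fixed_ipi(uint8_t dest_apic_id, uint8_t vector,
                       enum icr_shorthand shorthand)
{
    assert(vector >= 16);  /* vectors 0..15 are illegal for the APIC */
    return ((uint64_t)dest_apic_id << ICR_DEST_SHIFT) |
           ((uint64_t)shorthand << ICR_SHORTHAND_SHIFT) |
           ICR_LEVEL_ASSERT | vector;
}

/* The two 32-bit halves, as they are written to the LAPIC registers. */
uint32_t icr_high(uint64_t icr) { return (uint32_t)(icr >> 32); }
uint32_t icr_low(uint64_t icr)  { return (uint32_t)(icr & 0xFFFFFFFFu); }
```

In xAPIC mode the high half is written first, because the write to the low half is what triggers the send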
I/O APIC
The I/O APIC's primary function is to receive external interrupts from the system and I/O devices and distribute them to the LAPICs of selected CPUs on the system bus
Interrupts are delivered based on configurations set up through a group of memory-mapped registers called the redirection table. Each IRQ has a dedicated register (table entry) specifying its interrupt vector number, destination LAPIC and other setup information, which is used to route the interrupt from the external device to the LAPICs
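To illustrate a redirection table entry, the sketch below builds the 64-bit value for one IRQ and computes the indices of the two 32-bit registers that hold it. The layout (vector in bits 0-7, mask in bit 16, destination in bits 56-63, registers starting at index 0x10) follows the classic 82093AA I/O APIC, whose 24 entries are assumed here; real code reads the entry count from the version register. All names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IOAPIC_REDTBL_BASE 0x10u
#define IOAPIC_MAX_IRQS    24u          /* assumption: 82093AA-style I/O APIC */
#define IOREDTBL_MASKED    (1ull << 16)
#define IOREDTBL_DEST_SHIFT 56u

/* Builds an edge-triggered, active-high, fixed-delivery entry. */
uint64_t redirection_entry(uint8_t vector, uint8_t dest_apic_id, bool masked)
{
    assert(vector >= 16);
    return ((uint64_t)dest_apic_id << IOREDTBL_DEST_SHIFT) |
           (masked ? IOREDTBL_MASKED : 0ull) | vector;
}

/* Register index holding the low 32 bits of the entry for an IRQ. */
uint8_t redtbl_low_index(unsigned irq)
{
    assert(irq < IOAPIC_MAX_IRQS);
    return (uint8_t)(IOAPIC_REDTBL_BASE + 2u * irq);
}

/* Register index holding the high 32 bits of the entry for an IRQ. */
uint8_t redtbl_high_index(unsigned irq)
{
    return (uint8_t)(redtbl_low_index(irq) + 1u);
}
```

The indices are not memory addresses: the I/O APIC is accessed indirectly, by writing an index to its register-select register and then reading or writing its data window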
Detection
You can find all of the APICs on a system (both local and I/O APICs) by parsing the ACPI Multiple APIC Description Table (MADT)
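A rough sketch of such a parser is given below. It assumes the table has already been located and copied into a byte buffer, and relies on the MADT layout from the ACPI specification: a 44-byte header followed by variable-length entries that each start with a type byte and a length byte (type 0 for a local APIC, type 1 for an I/O APIC). The names are hypothetical, and the sketch only counts entries; it ignores checksums and the per-entry enabled flag:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MADT_HEADER_SIZE     44u  /* 36-byte ACPI header + LAPIC address + flags */
#define MADT_TYPE_LOCAL_APIC 0u
#define MADT_TYPE_IO_APIC    1u

struct apic_counts {
    unsigned local_apics;
    unsigned io_apics;
};

/* Walks the MADT entries and counts local APIC and I/O APIC records. */
struct apic_counts madt_count_apics(const uint8_t *madt, size_t len)
{
    struct apic_counts c = {0, 0};
    size_t pos = MADT_HEADER_SIZE;

    assert(madt != NULL);
    while (pos + 2 <= len) {
        uint8_t type = madt[pos];
        uint8_t entry_len = madt[pos + 1];

        /* Stop on a malformed entry instead of reading out of bounds. */
        if (entry_len < 2 || pos + entry_len > len)
            break;

        if (type == MADT_TYPE_LOCAL_APIC)
            c.local_apics++;
        else if (type == MADT_TYPE_IO_APIC)
            c.io_apics++;

        pos += entry_len;  /* entries of other types are skipped */
    }
    return c;
}
```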
Message Signaled Interrupts (MSI)
Message signaled interrupts (MSI) allow a PCI device to write a small amount of interrupt-describing data to a special memory-mapped I/O address, and the chipset then delivers the corresponding interrupt to a CPU
MSI provides the following benefits:
- Eliminates the need for a dedicated interrupt line for each device
- Supports a larger number of devices without requiring additional physical interrupt lines
- Improves performance
- Allows each device to signal interrupts independently, avoiding conflicts that can arise when multiple devices share the same interrupt line
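On x86 the address and data that the device writes have a fixed format, which the following sketch encodes for the simplest case (fixed delivery, edge-triggered, physical destination). The layout is taken from the Intel manual: the address is 0xFEE00000 with the destination APIC ID in bits 12-19, and the data carries the vector in bits 0-7. The function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define MSI_ADDRESS_BASE   0xFEE00000u
#define MSI_DEST_ID_SHIFT  12u

/* Address the device writes to in order to target the given LAPIC. */
uint32_t msi_address(uint8_t dest_apic_id)
{
    return MSI_ADDRESS_BASE | ((uint32_t)dest_apic_id << MSI_DEST_ID_SHIFT);
}

/* Data value for a fixed-delivery, edge-triggered interrupt. */
uint16_t msi_data(uint8_t vector)
{
    assert(vector >= 16);  /* vectors 0..15 are illegal for the APIC */
    return vector;
}
```

The OS programs these two values into the device's MSI capability registers in PCI configuration space; from then on the device raises the interrupt with an ordinary memory write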